Crash Consistency

fsck

Superblock: Sanity Check:

  1. Check the system size > the number of blocks that have been allocated.
  2. We want to find a suspected superblock.

Free Blocks:

Ensure inodes in use ar emorked in inode bitmaps. Update the inode bitmap if needed.

Inode state: check inode fields for possible corruption

Like check a vaild mode field.

Duplicates: check if two different inodes refer to same block.

Clear One if obviously bad, or give each inode its own copy of block

Bad blocks: bad pointers (like outside of valid range)

Remove.

Directory checks: Integrity of directory structure

Such as, . and .. must be the first two entries

Journalling

Transaction structure

Huffman Coding

Total Encoding Length means the total number of bits needed for the “prefix codes”

ABL(Averyage Number of bits per letter) is simply TEL / n (n is the total number of letters).

An optimal prefix codes is that with the smallest ABL.

Definition: A binary tree is full if each internal node has 2 children.

Clamin: The binary tree corresponding to the optimal prefix code is full.

Proof:

Let K be an optimal prefix code. Assume there is a T such that T is a binary tree corresponding to an optimal prefix code K and T is not full – T has a node u that has exactly one child v.

  1. If u is the root, we can just make a new tree T’ by deleting u and making v the root.
  2. If u is not root, we can make a new tree T
    1. by deleting u and mking v the child of w.

And it is true that, by either case, the number of bits needed decreases, and we can stilll make them intact for each other.

Let K’ be such prefix code. We have ABL(K’) < ABL(K). It is against the fact that K is an optimal prefix code. So, the assumption that there is a T such that T is a binary tree corresponding to an optimal prefix code K and T is not full – T has a node u that has exactly one child v, is false.

But how to find such a tree?

Corollary: There is an optimal binary tree T*, in which the two lowest frequency letters are assigned to leaves that are siblinsg.

Here comes the Huffman Algorithm

function HuffmanEncoding(S, freq)
  if S == {a, b}
    y(a) = 0
    y(b) = 1
  else
    Let y* and z* be the two lowest frequency letters
    Form a new alphabet S' = S - (y*, z*) + w, with freq(w) = freq(y*) + freq(z*)
    y' = HuffmanEncoding(S', freq)
    Extend y' to a prefix code y for S as follows:
      y(y*) = y'(w).append(0)
      y(z*) = y'(w).append(1)
  return y

Dynamic Programming: Summary to some concepts after reading CLRS chapter 15

What is dynamic programming

Dynamic programming is like divide-and-conquer method. It also solves problems by combining the solutions to subproblems.

But, dynamic programming applies when the sub-problems overlap (sub-problems share sub-sub-problems, in DAC, sub-problems are disjoint.)

Obivously, when sub-problems share sub-sub-problems, DAC does more work than necessary (not good).

What will dynamic programming (I will call it DC followingly) do? It saves the solutions to the sub-sub-problems in a table so that the solutions can be reused everytime it solves the same sub-sub-problems.

DC are usually applied to Optimization Problems – we want a solution with Optimal Value, either minimal or maximal. Note that the solutions may not be unique so we call it “an optimal solution” instead of “the optimal solution”.

Steps to develop a dynamic-programming algorithm

  1. Characterize the structure of an optimal solution. ???
  2. Recursively define the value of an optimal solution. ???
  3. Compute the value of an optimal solution, typically in a bottom-up fashion. ???
  4. Construct an optimal solution from computed information (only needed when we want the solution itself). ???

Some examples to explain the steps

Rot Cutting

What we want

a way to cut a rod into rods of smaller length that maximizes their total value.

Context

Serling Enterprises buys long steel rods and cuts them into shorter rods, which it then sells. Each cut is free. The management of Serling Enterprises wants to know the best way to cut up the rods.

Although typically, the larger the length is, the higher the price is, it doesn’t mean we don’t need any cutting to get higher revenue. See the following price table:

Length12345678910
Price1589101717202430
Sample Price Table

For example, for a rod of length 4. Cutting it into two 2-length smaller rods gives us the maximal revenue: 10. (We won’t know it unless we list all the ways to cut and compute each revenue.)

We can have a general solution to the optimal value.

r_n = max(p_n, r_1 + r_{n-1}, r_2 + r_{n-2}, ..., r_{n-1} + r_1)

It is recursion. For every r_i + r_{n-1}, we mean at the first cutting, we cut it into sub-rods of length i and n-1. And obviously, the arguments to the MAX function have all the possible cases of the first cutting, and so we can get the optimal value after comparing all the sum of the optimal values of the sub-rod and the value of no-cutting-at-all.

When we computer the optimal values of the sub-rods, we may have the sub-problems which we have solved. For example, we will need the optimal values of rod of length 2, when we want the optimal values of both 3-length rod and 4-length rod. If we stored the solutions to the optimal values of 2-length rod, we don’t need to solve the problem (solution of optimal value of 2-length rod.)

Or even simpler

r_n = max(p_i + r_{n-i})

This, we only computes the optimal value of the right-hand remaining sub-rod. Why it works? Because we can view the left-hand sub-rod as part of the optimal solution.

The Algorithm
Not-efficient One
CUT-ROD(p, n)
  if n == 0:
     return 0
  q = -99999999999
  for i = 0 to n:
    q = max(q, p[i] + CUT-ROD(p[n-i]))
  return q

It is not efficient because we call CUT-ROD every time.

Efficitent One (Thanks to Dynamic Programming)

We store the solutions in memory. So, when meeting the same problem, we don’t need to recompute the solutions. It is an exmaple of time-temory de-off.

There are to ways to implement dynamic programming:

top-down with memoization (memorization)

it remembers what results it has computed previously.

MEMORIZED-CUT-ROD(p, n)
  let r[0, ..., n] be a new array
  for i = 0 to n
    r[i] = -9999999999999999999
  return MEMORIZED-CUT-ROD-AUX(p, n, r)

MEMORIZED-CUT-ROD-AUX(p, n, r)
  if(r[n] >= 0)  //has been solved before
    return r[n]
  if(n == 0)
    return 0
  else
    q = -9999999999999999999
      for i = 1 to n
        q = max(p, p[i] + MEMORIZED-CUT-ROD-AUX(p, n - i, r))
  r[n] = q
  return q

You can say the only difference between this version and the original CUT-ROD is the we “memorize” the optimal values.

bottom-up method

We solve each subproblem only once, and when wefi rst see it, we have already solved all of its prerequisite subproblems.

BOTTOM-UP-CUT-ROD(p, n)
  let r[0, ..., n] be a new array
  r[0] = 0
  for j = 1 to n
    q = -99999
    for i = 1 to j //solve the smaller problems first
      q = max(q, p[i] + r[j - i])
    r[j] = q
  return r[n] // r[n] is what we want
Subproblem graphs

It helps us understand the set of subproblems involved and how subproblems depend on each other.

Like this one

subproblem graph.
Restructing a solution
EXTENDED-BOTTOM-UP-CUT-ROD(p, n)
  let r[0, ..., n] and s[0, ..., n] be new arrays.
  r[0] = 0
  for j = 1 to n
    q = -99999
    for i = 1 to j //solve the smaller problems first
      if(q < p[i] + r[j - i]) //Found a better solution
        q = max(q, p[i] + r[j - i])
        s[j] = i //We should cut the left-hand-size to i at the first cut!
    r[j] = q
  return r and s // r[n] is what we want

PRINT-CUT-ROD-SOLUTION(p, n)
  (r, s) = EXTENDED-BOTTOM-UP-CUT-ROD(p, n)
  while n > 0
    print s[n]
    n = n - s[n]
  1. Why return r and s: because here we want a solution. So we have to return the whole s to know what the first cutting is for each size (sub-problem).
  2. Why n – s[n]: this is because for a rod of size n, the optimal value is reached when the first cutting is s[n], after first cutting, the remaining rod has size n – s[n].

How to Change SSH Port in Ubuntu 20.04

In this article, I will show you how to change SSH Port in Ubuntu 20.04 for your server’s security. I will make 43030 the new SSH Port.

Open New SSH Port

Please note that it is crucial to know if you have Firewall installed (in most cases, I would say “yes”), you must add a rule to allow new SSH port! Otherwise, you won’t be able to use SSH to connect to your Ubuntu server.

You can do so by running the following command:

sudo ufw allow 43030/tcp

Change SSH Port by modifying sshd_config

First, we are going to open sshd_config file with nano.

nano /etc/ssh/sshd_config

You will find a line of codes:

#Port 22

Change the code to the following (unmute and change port):

Port 43030

And then we are going to restart SSH server so that the new config is in effective:

systemctl restart sshd

Connect to Your Server with the new port

You can run the following command with option p to connect to your server:

ssh [email protected]_id_or_hostname -p 43030

You Can Now Use Your Face to Play the Classical Snake Game

Have you ever been addicted to Snake?

Now, with the help of a TensorFlow model called PoseNet, you can get a totally different experience with this classical game. That’s because:

You can now use your face to control the snake!

Have a try here: https://www.panchen.xyz/game/snake

Acknowledgements:

  1. To the PoseNet team who makes the core feature of my face control snake game possible.
  2. To the unknown person who shares his / her snake game so that I could just modify the codes to make my game.

236 Exam

Sad

(Aside format test)

You will see “Location”, “Weather” and “Mood”.

Gandi.net will house panchen.xyz since August 22

Alibaba Cloud is currently housing my primary domain name – PanChen.xyz while I have made a decision to find a new home for it and it is called Gandi.net, a French company. And the transfer will happen on August 22.

– Why did I decide to move my domain name away from Alibaba Cloud to Gandi.net?

  1. Gandi.net provides a free domain name email service to all domains registered there. And I am enjoying this service with one of my domain name and the experience is great. And such a service is not provided by the majority of the registries (most of them charge money).
  2. The price of registering, renewing and transferring isn’t “cheap” at Gandi.net. But I renewed my PanChen.xyz for 10 years during a promotion with the total cost of around $20. So even the transferring costs $15 and only adds one year to my domain name’s registration period. It is still a great deal since I can enjoy what Gandi.net provides for its customer, particularly the free domain email, for at least 10 years once I transfer my domain to them.

3. Gandi.net is a French company and that means my personal information might be protected in a relatively better way.

– Why won’t I move PanChen.xyz until August 22?

Money matters! As I mentioned before, PanChen.xyz now has a ten-year registration period which also means that the registration period can not be extended according to the domain name’s policy (This policy, as I know, applies to many well-known TLDs, such as .com, .org, .net …). So I have to wait for August 22, when the registration period becomes 9 years so that after I pay Gandi.net the transfer fee, the registration period can be extended to 10 years again automatically.