Real-life representations are often prone to corruption. Biological codes, like RNA, may mutate naturally1 and during measurement; cosmic radiation and other ambient noise can flip bits in computer storage2. One way to recover from corrupted data is to introduce or exploit redundancy.
Consider the following algorithm to introduce redundancy in a string of \(0\)s and \(1\)s.
procedure \(\textit{redun3}\)(\(a_{k-1} \cdots a_0\): a nonempty bitstring) for \(i\) := \(0\) to \(k-1\) \(c_{3i}\) := \(a_i\) \(c_{3i+1}\) := \(a_i\) \(c_{3i+2}\) := \(a_i\) return \(c_{3k-1} \cdots c_0\)
procedure \(\textit{decode3}\)(\(c_{3k-1} \cdots c_0\): a nonempty bitstring whose length is an integer multiple of \(3\)) for \(i\) := \(0\) to \(k-1\) if exactly two or three of \(c_{3i}, c_{3i+1}, c_{3i+2}\) are set to \(1\) \(a_i\) := 1 else \(a_i\) := 0 return \(a_{k-1} \cdots a_0\)
Give a recursive definition of the set of outputs of the \(redun3\) procedure, \(Out\),
Consider the message \(m = 0001\) so that the sender calculates \(redun3(m) = redun3(0001) = 000000000111\).
Introduce \(\underline{\phantom{~~4~~}}\) errors into the message so that the signal received by the receiver is \(\underline{\phantom{010100010101}}\) but the receiver is still able to decode the original message.
Challenge: what is the biggest number of errors you can introduce?
Building a circuit for lines 3-6 in \(decode\) procedure: given three input bits, we need to determine whether the majority is a \(0\) or a \(1\).
2
\(c_{3i}\) | \(c_{3i+1}\) | \(c_{3i+2}\) | \(a_i\) |
---|---|---|---|
\(1\) | \(1\) | \(1\) | \(\phantom{1}\) |
\(1\) | \(1\) | \(0\) | \(\phantom{1}\) |
\(1\) | \(0\) | \(1\) | \(\phantom{1}\) |
\(1\) | \(0\) | \(0\) | \(\phantom{0}\) |
\(0\) | \(1\) | \(1\) | \(\phantom{1}\) |
\(0\) | \(1\) | \(0\) | \(\phantom{0}\) |
\(0\) | \(0\) | \(1\) | \(\phantom{0}\) |
\(0\) | \(0\) | \(0\) | \(\phantom{0}\) |
Circuit
Recall that \(S\) is defined as the set of all RNA strands, nonempty strings made of the bases in \(B = \{\texttt{A},\texttt{U},\texttt{G},\texttt{C}\}\). We define the functions \[\textit{mutation}: S \times \mathbb{Z}^+ \times B \to S \qquad \qquad \textit{insertion}: S \times \mathbb{Z}^+ \times B \to S\] \[\textit{deletion}: \{ s\in S \mid rnalen(s) > 1\} \times \mathbb{Z}^+ \to S \qquad \qquad \textrm{with rules}\]
procedure \(\textit{mutation}\)(\(b_1\cdots b_n\): \(\textrm{a RNA strand}\), \(k\): \(\textrm{a positive integer}\), \(b\): \(\textrm{an element of } B\)) for \(i\) := \(1\) to \(n\) if \(i\) = \(k\) \(c_i\) := \(b\) else \(c_i\) := \(b_i\) return \(c_1\cdots c_n\) \(\{ \textrm{The return value is a RNA strand made of the } c_i \textrm{ values}\}\)
procedure \(\textit{insertion}\)(\(b_1\cdots b_n\): \(\textrm{a RNA strand}\), \(k\): \(\textrm{a positive integer}\), \(b\): \(\textrm{an element of } B\)) if \(k > n\) for \(i\) := \(1\) to \(n\) \(c_i\) := \(b_i\) \(c_{n+1}\) := \(b\) else for \(i\) := \(1\) to \(k-1\) \(c_i\) := \(b_i\) \(c_k\) := \(b\) for \(i\) := \(k+1\) to \(n+1\) \(c_i\) := \(b_{i-1}\) return \(c_1\cdots c_{n+1}\) \(\{ \textrm{The return value is a RNA strand made of the } c_i \textrm{ values}\}\)
procedure \(\textit{deletion}\)(\(b_1\cdots b_n\): \(\textrm{a RNA strand with } n>1\), \(k\): \(\textrm{a positive integer}\)) if \(k > n\) \(m\) := \(n\) for \(i\) := \(1\) to \(n\) \(c_i\) := \(b_i\) else \(m\) := \(n-1\) for \(i\) := \(1\) to \(k-1\) \(c_i\) := \(b_i\) for \(i\) := \(k\) to \(n-1\) \(c_i\) := \(b_{i+1}\) return \(c_1\cdots c_m\) \(\{ \textrm{The return value is a RNA strand made of the } c_i \textrm{ values}\}\)
Algorithms can be expressed in English or in more formalized descriptions like pseudocode or fully executable programs.
Sometimes, we can define algorithms whose output matches the rule for a function we already care about. Consider the (integer) logarithm function \[logb : \{b \in \mathbb{Z} \mid b >1 \} \times \mathbb{Z}^+ ~~\to~~ \mathbb{N}\] defined by \[logb (~ (b,n)~) = \text{greatest integer } y \text{ so that } b^y \text{ is less than or equal to } n\]
procedure \(logb\)(\(b\),\(n\): positive integers with \(b > 1\)) \(i\) := \(0\) while \(n\) > \(b-1\) \(i\) := \(i + 1\) \(n\) := \(n\) div \(b\) return \(i\) \(\{ i\) holds the integer part of the base \(b\) logarithm of \(n\}\)
Trace this algorithm with inputs \(b=3\) and \(n=17\)
\(b\) | \(n\) | \(i\) | \(n > b-1\)? | |
---|---|---|---|---|
Initial value | \(3\) | \(17\) | ||
After 1 iteration | ||||
After 2 iterations | ||||
After 3 iterations | ||||
Compare: does the output match the rule for the (integer) logarithm function?
Two algorithms for constructing base \(b\) expansion from decimal representation
Most significant first: Start with left-most coefficient of expansion (highest value)
Informally: Build up to the value we need to represent in “greedy” approach, using units determined by base.
procedure \(\textit{baseb1}\)(\(n, b\): positive integers with \(b > 1\)) \(v\) := \(n\) \(k\) := \(1 +\) output of \(logb\) algorithm with inputs \(b\) and \(n\) for \(i\) := \(1\) to \(k\) \(a_{k-i}\) := \(0\) while \(v \geq b^{k-i}\) \(a_{k-i}\) := \(a_{k-i} + 1\) \(v\) := \(v - b^{k-i}\) return \((a_{k-1}, \ldots, a_0) \{(a_{k-1} \ldots a_0)_b~\textrm{ is the base } b \textrm{ expansion of } n \}\)
Least significant first: Start with right-most coefficient of expansion (lowest value)
Idea: (when \(k > 1\)) \[\begin{aligned} n &= a_{k-1} b^{k-1} + \cdots + a_1 b + a_0 \\ &= b ( a_{k-1} b^{k-2} + \cdots + a_1) + a_0 \end{aligned}\] so \(a_0 = n \textbf{ mod } b\) and \(a_{k-1} b^{k-2} + \cdots + a_1 = n \textbf{ div } b\).
procedure \(\textit{baseb2}\)(\(n, b\): positive integers with \(b > 1\)) \(q\) := \(n\) \(k\) := \(0\) while \(q \neq 0\) \(a_{k}\) := \(q\) mod \(b\) \(q\) := \(q\) div \(b\) \(k\) := \(k+1\) return \((a_{k-1}, \ldots, a_0) \{(a_{k-1} \ldots a_0)_b~\textrm{ is the base } b \textrm{ expansion of } n \}\)
Practice: write an algorithm for converting from base \(b_1\) expansion to base \(b_2\) expansion:
Mutations of specific RNA codons have been linked to many disorders and cancers.↩︎
This RadioLab podcast episode goes into more detail on bit flips: https://www.wnycstudios.org/story/bit-flip↩︎