bioinformatics

Recall that \(S\) is defined as the set of all RNA strands, nonempty strings made of the bases in \(B = \{\texttt{A},\texttt{U},\texttt{G},\texttt{C}\}\). We define the functions \[\textit{mutation}: S \times \mathbb{Z}^+ \times B \to S \qquad \qquad \textit{insertion}: S \times \mathbb{Z}^+ \times B \to S\] \[\textit{deletion}: \{ s\in S \mid rnalen(s) > 1\} \times \mathbb{Z}^+ \to S \qquad \qquad \textrm{with rules}\]

procedure \(\textit{mutation}\)(\(b_1\cdots b_n\): \(\textrm{a RNA strand}\), \(k\): \(\textrm{a positive integer}\), \(b\): \(\textrm{an element of } B\)) for \(i\) := \(1\) to \(n\) if \(i\) = \(k\) \(c_i\) := \(b\) else \(c_i\) := \(b_i\) return \(c_1\cdots c_n\) \(\{ \textrm{The return value is a RNA strand made of the } c_i \textrm{ values}\}\)

procedure \(\textit{insertion}\)(\(b_1\cdots b_n\): \(\textrm{a RNA strand}\), \(k\): \(\textrm{a positive integer}\), \(b\): \(\textrm{an element of } B\)) if \(k > n\) for \(i\) := \(1\) to \(n\) \(c_i\) := \(b_i\) \(c_{n+1}\) := \(b\) else for \(i\) := \(1\) to \(k-1\) \(c_i\) := \(b_i\) \(c_k\) := \(b\) for \(i\) := \(k+1\) to \(n+1\) \(c_i\) := \(b_{i-1}\) return \(c_1\cdots c_{n+1}\) \(\{ \textrm{The return value is a RNA strand made of the } c_i \textrm{ values}\}\)

procedure \(\textit{deletion}\)(\(b_1\cdots b_n\): \(\textrm{a RNA strand with } n>1\), \(k\): \(\textrm{a positive integer}\)) if \(k > n\) \(m\) := \(n\) for \(i\) := \(1\) to \(n\) \(c_i\) := \(b_i\) else \(m\) := \(n-1\) for \(i\) := \(1\) to \(k-1\) \(c_i\) := \(b_i\) for \(i\) := \(k\) to \(n-1\) \(c_i\) := \(b_{i+1}\) return \(c_1\cdots c_m\) \(\{ \textrm{The return value is a RNA strand made of the } c_i \textrm{ values}\}\)

Alternating quantifiers order rna examples

\[\forall s \in S ~\exists n \in \mathbb{N} ~(~basecount(~(s,\texttt{U})~) = n~)\]

In English: For each strand, there is a nonnnegative integer that counts the number of occurrences of \(\texttt{U}\) in that strand.
\[\exists n \in \mathbb{N} ~\forall s \in S ~(~basecount(~(s,\texttt{U})~) = n~)\]

In English: There is a nonnnegative integer that counts the number of occurrences of \(\texttt{U}\) in every strand.

In English: For each RNA strand there is a base that occurs 3 times in this strand.
Write the negation and use De Morgan’s law to find a logically equivalent version where the negation is applied only to the \(BC\) predicate (not next to a quantifier).

Alternating quantifiers strategies rna examples

Alternating quantifiers proofs rna examples

Hint: first translate the statements to English and identify the main logical structure.

\(\exists s \in S \, (\textit{rnalen(s)} = \textit{basecount}(~(s, \texttt{A})~)\)

\(\forall s \in S \, (\textit{rnalen(s)} \geq \textit{basecount}(~(s, \texttt{A})~))\)

Defining functions more examples

Recall: We model the collection of user ratings of the four movies Dune, Oppenheimer, Barbie, Nimona as the set \(\{-1,0,1\}^4\) . One function that compares pairs of ratings is \[d_0: \{-1,0,1\}^4 \times \{-1,0,1\}^4 \to \mathbb{R}\] given by \[d_0 (~(~ (x_1, x_2, x_3, x_4), (y_1, y_2, y_3, y_4) ~) ~) = \sqrt{ (x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 -y_3)^2 + (x_4 -y_4)^2}\]

Notice: there are (at most) \[(3 \cdot 3 \cdot 3 \cdot 3)\cdot (3 \cdot 3 \cdot 3 \cdot 3) = 3^8 = 6561\] many pairs of ratings. There are therefore lots and lots of real numbers that are not the output of \(d_0\).

Recall: RNA is made up of strands of four different bases that encode genomic information in specific ways.
The bases are elements of the set \(B = \{\texttt{A}, \texttt{C}, \texttt{U}, \texttt{G}\}\). The set of RNA strands \(S\) is defined (recursively) by: \[\begin{array}{ll} \textrm{Basis Step: } & \texttt{A}\in S, \texttt{C}\in S, \texttt{U}\in S, \texttt{G}\in S \\ \textrm{Recursive Step: } & \textrm{If } s \in S\textrm{ and }b \in B \textrm{, then }sb \in S \end{array}\] where \(sb\) is string concatenation.

Pro-tip: informal definitions sometime use \(\cdots\) to indicate “continue the pattern”. Often, to make this pattern precise we use recursive definitions.

Name	Domain	Codomain	Rule	Example
\(rnalen\)	\(S\)	\(\mathbb{Z}^+\)	\[\begin{aligned} &\textrm{Basis Step:} \\ &\textrm{If } b \in B\textrm{ then } \textit{rnalen}(b) = 1 \\ &\textrm{Recursive Step:}\\ &\textrm{If } s \in S\textrm{ and } b \in B\textrm{, then }\\ &\textit{rnalen}(sb) = 1 + \textit{rnalen}(s) \end{aligned}\]	\[\begin{aligned} rnalen(\texttt{A}\texttt{C}) &\overset{\text{rec step}}{=} 1 +rnalen(\texttt{A}) \\ &\overset{\text{basis step}}{=} 1 + 1 = 2 \end{aligned}\]
\(basecount\)	\(S \times B\)	\(\mathbb{N}\)	\[\begin{aligned} &\textrm{Basis Step:} \\ &\textrm{If } b_1 \in B, b_2 \in B \textrm{ then} \\ &basecount(~(b_1, b_2)~) = \\ &\begin{cases} 1 & \textrm{when } b_1 = b_2 \\ 0 & \textrm{when } b_1 \neq b_2 \\ \end{cases}\\ &\textrm{Recursive Step:}\\ &\textrm{If } s \in S, b_1 \in B, b_2 \in B\\ &basecount(~(sb_1, b_2)~) = \\ &\begin{cases} 1 + \textit{basecount}(~(s, b_2)~) & \textrm{when } b_1 = b_2 \\ \textit{basecount}(~(s, b_2)~) & \textrm{when } b_1 \neq b_2 \\ \end{cases} \end{aligned}\]	\[\begin{aligned} basecount(~(\texttt{A}\texttt{C}\texttt{U}, \texttt{C})~) = \end{aligned}\]
“\(2\) to the power of”	\(\mathbb{N}\)	\(\mathbb{N}\)	\[\begin{aligned} &\textrm{Basis Step:} \\ &2^0= 1 \\ &\textrm{Recursive Step:}\\ &\textrm{If } n \in \mathbb{N}, 2^{n+1} = \phantom{2 \cdot 2^n} \end{aligned}\]
“\(b\) to the power of \(i\)”	\(\mathbb{Z}^+ \times \mathbb{N}\)	\(\mathbb{N}\)	\[\begin{aligned} &\textrm{Basis Step:} \\ &b^0 = 1 \\ &\textrm{Recursive Step:}\\ &\textrm{If } i \in \mathbb{N}, b^{i+1} = b \cdot b^i \end{aligned}\]

Defining sets

To define a set using roster method, explicitly list its elements. That is, start with \(\{\) then list elements of the set separated by commas and close with \(\}\).

To define a set using set builder definition, either form “The set of all \(x\) from the universe \(U\) such that \(x\) is ..." by writing \[\{x \in U \mid ...x... \}\] or form “the collection of all outputs of some operation when the input ranges over the universe \(U\)" by writing \[\{ ...x... \mid x\in U \}\]

We use the symbol \(\in\) as “is an element of” to indicate membership in a set.

Example sets: For each of the following, identify whether it’s defined using the roster method or set builder notation and give an example element.

Rna motivation

RNA is made up of strands of four different bases that encode genomic information in specific ways.
The bases are elements of the set \(B = \{\texttt{A}, \texttt{C}, \texttt{U}, \texttt{G}\}\). Strands are ordered nonempty finite sequences of bases.

Formally, to define the set of all RNA strands, we need more than roster method or set builder descriptions.

Set recursive examples

Definition The set of nonnegative integers \(\mathbb{N}\) is defined (recursively) by: \[\begin{array}{ll} \textrm{Basis Step: } & \phantom{0 \in \mathbb{N}} \\ \textrm{Recursive Step: } & \phantom{\textrm{If } n \in \mathbb{N} \textrm{, then } n+1 \in \mathbb{N}} \end{array}\]

Definition The set of all integers \(\mathbb{Z}\) is defined (recursively) by: \[\begin{array}{ll} \textrm{Basis Step: } & \phantom{0 \in \mathbb{Z}} \\ \textrm{Recursive Step: } & \phantom{\textrm{If } x \in \mathbb{Z} \textrm{, then } x+1 \in \mathbb{Z} \textrm{ and } x-1 \in \mathbb{Z}} \end{array}\]

Definition The set of RNA strands \(S\) is defined (recursively) by: \[\begin{array}{ll} \textrm{Basis Step: } & \texttt{A}\in S, \texttt{C}\in S, \texttt{U}\in S, \texttt{G}\in S \\ \textrm{Recursive Step: } & \textrm{If } s \in S\textrm{ and }b \in B \textrm{, then }sb \in S \end{array}\] where \(sb\) is string concatenation.

Definition The set of bitstrings (strings of 0s and 1s) is defined (recursively) by: \[\begin{array}{ll} \textrm{Basis Step: } & \phantom{\lambda \in X} \\ \textrm{Recursive Step: } & \phantom{\textrm{If } s \in X \textrm{, then } s0 \in X \text{ and } s1 \in X} \end{array}\]

Notation: We call the set of bitstrings \(\{0,1\}^*\) and we say this is the set of all strings over \(\{0,1\}\).