lec:adding lec17/18

2018-10-31 17:45:00 -07:00 · 2018-10-31 17:45:00 -07:00 · b5cae05635
commit b5cae05635
parent 873898cf5b
3 changed files with 130 additions and 0 deletions
--- a/cst363/lec/lec16.md
+++ b/cst363/lec/lec16.md
@ -0,0 +1,61 @@
 # lec16
 Let's now go into how we build a utility like the _transaction_ in SQLite.
 ## Problem Statement
 > Why we need or care about transactions.
 If we have tons of users all trying to access a database to, say, reserve a hotel room, we need to make sure that their operations don't fail halfway or overwrite each other.
 Otherwise we're going to have tons of undefined behavior.
 ## A.C.I.D Principles 
 ### Atomicity
 Mnemonically: _all or nothing_
 Either everything in our transaction happens, or none of it happens.
 The reason why we care about this is because we want to be able to _recover_ from problems, like a power outage for instance, or some error which causes a halt.
 To achieve this we have to log everything we're going to do.
 Before we do anything in our transaction, we log which changes are about to be made and what the new values are.
 WAL: _write-ahead logging_
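 A minimal sketch of _all or nothing_, using Python's built-in `sqlite3` module and the hotel example from the problem statement. The `rooms` and `bookings` tables and the `book_room` helper are made up for illustration. The `with conn:` block commits if every statement succeeds and rolls back if an exception escapes.

```python
import sqlite3

def make_db():
    """Build a throwaway in-memory database with one free room."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rooms (id INTEGER PRIMARY KEY, available INTEGER NOT NULL)")
    conn.execute("CREATE TABLE bookings (guest TEXT NOT NULL, room_id INTEGER NOT NULL)")
    conn.execute("INSERT INTO rooms VALUES (1, 1)")
    conn.commit()
    return conn

def book_room(conn, guest, room_id):
    """Mark the room as taken and record the booking as one atomic unit."""
    with conn:  # commit on success, roll back if an exception escapes
        cur = conn.execute(
            "UPDATE rooms SET available = 0 WHERE id = ? AND available = 1",
            (room_id,),
        )
        if cur.rowcount == 0:
            raise ValueError("room is not available")
        # If this INSERT fails, the UPDATE above is undone as well.
        conn.execute(
            "INSERT INTO bookings (guest, room_id) VALUES (?, ?)",
            (guest, room_id),
        )
```

 Calling `book_room(conn, None, 1)` fails on the `INSERT` because `guest` is `NOT NULL`, and the room stays available since the earlier `UPDATE` is rolled back with it.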
 ### Consistency
 Like the name implies, every transaction should take the database from one valid state to another, so the result is predictable every time.
 ### Isolation
 Transactions should never be able to peek into another transaction.
 As the name implies the transaction runs alone.
 ### Durability
 Essentially once we reach the end of a transaction we should commit those changes to the database.
 This way if something goes wrong, where the whole database needs to be shutdown, our changes should still be there.
 _Basically this means that we dump anything in our transaction buffer onto disk_.
 To achieve this we must verify that the changes were actually committed to the disk.
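 A small sketch of what "actually committed to the disk" can look like, assuming a plain append-only file stands in for the database's log. The `durable_append` name is hypothetical. `os.fsync` asks the operating system to flush its cache to the storage device; whether the hardware honors that request is outside the program's control.

```python
import os

def durable_append(path, record):
    """Append one record and return only after the OS reports it flushed."""
    with open(path, "ab") as f:
        f.write(record.encode("utf-8") + b"\n")
        f.flush()             # push Python's own buffer down to the OS
        os.fsync(f.fileno())  # ask the OS to push its cache to the device
```

 Only after `durable_append` returns would we tell the user that the transaction committed.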
 ## Serializability
 What we ultimately want is to be able to run multiple transactions at once while still getting the same result as if we had run them one after another in some serial order.
 We want that result because it maintains isolation for each transaction.
 ## Transaction Schedule
 If we have two complex transactions that need to run then we can schedule them in some manner.
 Sometimes it means that we do one transaction first then another, and sometimes it means we do pieces of both in some order.
 The latter is known as _interleaving_.
 A schedule is called _serializable_ if its result matches that of some serial schedule of the same transactions.
 ### More on interleaving
 We mentioned interleaving earlier.
 Basically this just means that we run part of one transaction then another part of a _different_ transaction.
 We only do this if the result of this operation is the same as running them in a serialized fashion.
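 To make the interleaving rule concrete, here is a toy simulator. It assumes a schedule is a list of `(txn, op, item, fn)` steps, where a read copies the item into the transaction's private view and a write stores `fn` of the value that transaction last read. All names are invented for this sketch, and it only compares end results for one starting state, so treat it as an illustration rather than a proof of serializability.

```python
from itertools import permutations

def run(schedule, db):
    """Run (txn, op, item, fn) steps against a copy of db and return it."""
    db = dict(db)
    local = {}  # (txn, item) -> value that txn last read
    for txn, op, item, fn in schedule:
        if op == "read":
            local[(txn, item)] = db[item]
        else:  # "write": compute the new value from what this txn read
            db[item] = fn(local[(txn, item)])
    return db

def matches_some_serial_order(schedule, transactions, db):
    """True if the schedule ends in the same state as some serial order."""
    result = run(schedule, db)
    for order in permutations(transactions):
        serial = [step for txn in order for step in txn]
        if run(serial, db) == result:
            return True
    return False

# T1 moves 50 from A to B; T2 doubles A.
T1 = [("T1", "read", "A", None), ("T1", "write", "A", lambda v: v - 50),
      ("T1", "read", "B", None), ("T1", "write", "B", lambda v: v + 50)]
T2 = [("T2", "read", "A", None), ("T2", "write", "A", lambda v: v * 2)]

# T2 runs between the two halves of T1: same result as T1 then T2.
good = [T1[0], T1[1], T2[0], T2[1], T1[2], T1[3]]
# T2 reads A before T1 writes it, so T1's update to A is lost.
bad = [T1[0], T2[0], T1[1], T2[1], T1[2], T1[3]]
```

 With a starting state of `{"A": 100, "B": 100}`, `good` matches the serial order T1 then T2, while `bad` matches neither serial order because T2 overwrites T1's change to `A`.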
--- a/cst363/lec/lec17.md
+++ b/cst363/lec/lec17.md
@ -0,0 +1,47 @@
 # lec17
 In the previous lecture we covered methods and principles of transactions.
 This time around we'll take care of proper ordering of operations.
 ## Operation Order
 If two\* transactions work on two different data items then we know that their operations can't collide, so the order doesn't matter.
 The order matters if there is a collision between transactions on the same data item, that is, when at least one of the two operations is a write.
 _Conflict Serializability_ : the ability to turn an interleaved schedule into a serial schedule by swapping adjacent non-conflicting operations, so that every pair of conflicting operations keeps its original order.
 ## Determining Serializability
 We can go through a schedule where each transaction is placed into a graph as a node.
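 We draw an edge from transaction-A to transaction-B if an operation in transaction-A is followed later on by a conflicting operation on the same data item in transaction-B, for example a read in A followed by a write in B.
 The opposite (a write followed by a read) also applies, and so does a write followed by a write.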
 Our schedule is not serializable if we have a cycle in the resulting graph.
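 The graph test above can be sketched in a few lines. This assumes a schedule is a list of `(txn, op, item)` tuples with `op` being `"r"` or `"w"`; the function names are made up for this example. Two operations conflict when they come from different transactions, touch the same item, and at least one of them is a write.

```python
def precedence_graph(schedule):
    """Return the set of edges (earlier_txn, later_txn) between conflicts."""
    edges = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (op1, op2):
                edges.add((t1, t2))
    return edges

def has_cycle(edges):
    """Depth-first search that reports whether the directed graph has a cycle."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GRAY
        for nxt in graph[node]:
            if color[nxt] == GRAY:  # back edge: we looped onto our own path
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)

def is_conflict_serializable(schedule):
    return not has_cycle(precedence_graph(schedule))
```

 For instance, `[("T1","r","A"), ("T2","r","A"), ("T1","w","A"), ("T2","w","A")]` produces edges in both directions between T1 and T2, so it is rejected.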
 ## Locks
 Exclusive lock: process locks the database for itself. 
 Shared lock: allows others to put locks on the database, but not exclusive locks.
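 A rough in-memory sketch of how the two lock modes interact. The `LockTable` class and the `"S"`/`"X"` mode strings are assumptions for illustration, not any real database's API, and a refused request simply returns `False` instead of waiting in a queue.

```python
class LockTable:
    def __init__(self):
        self.holders = {}  # item -> (mode, set of txns holding the lock)

    def acquire(self, txn, item, mode):
        """Try to lock item; return True if granted, False if txn must wait."""
        held = self.holders.get(item)
        if held is None:
            self.holders[item] = (mode, {txn})
            return True
        held_mode, txns = held
        if mode == "S" and held_mode == "S":
            txns.add(txn)  # any number of readers can share the lock
            return True
        if txns == {txn}:  # the sole holder may re-acquire or upgrade
            new_mode = "X" if "X" in (mode, held_mode) else "S"
            self.holders[item] = (new_mode, txns)
            return True
        return False  # exclusive is involved and someone else holds the lock

    def release(self, txn, item):
        """Drop txn's lock on item; forget the item once nobody holds it."""
        mode, txns = self.holders[item]
        txns.discard(txn)
        if not txns:
            del self.holders[item]
```

 Two transactions can both hold `"S"` on the same item, but a request for `"X"` is refused until every other holder has released.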
 There are some drawbacks to using locks, especially if done poorly.
 If transaction-a locks some data, say exclusively, but doesn't release the lock before another transaction tries to use that data, we may end up in a state where everyone is locked out of certain data.
 For this reason we use a special locking protocol to take care of this exact scenario.
 The state where transactions are stuck waiting on each other's locks, so that none of them can proceed, is called a _deadlock_.
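 One common way to spot a deadlock is to look for a loop of transactions waiting on each other. The sketch below assumes a hypothetical `waits_for` dict that maps each blocked transaction to the single transaction holding the lock it wants; real lock managers track more than this.

```python
def find_deadlock(waits_for):
    """Return a list of txns that wait on each other in a loop, or None."""
    for start in waits_for:
        seen = [start]
        cur = waits_for.get(start)
        # Follow the chain of "who am I waiting on" until it ends or repeats.
        while cur is not None and cur not in seen:
            seen.append(cur)
            cur = waits_for.get(cur)
        if cur is not None:  # we came back to a txn already on the chain
            return seen[seen.index(cur):]
    return None
```

 If T1 holds A and wants B while T2 holds B and wants A, then `find_deadlock({"T1": "T2", "T2": "T1"})` reports both transactions, and one of them has to be aborted to break the loop.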
 ### Two-Phase locking Protocol
 The two phases include the _growing_ and _shrinking_ phase.
 This means we keep acquiring locks (growing) and then, once we release our first lock, we only release locks (shrinking) until there are none left.
 We don't mix locks and unlocks however, so `[lock lock free lock free free]` isn't valid but `[lock lock lock free free free]` is fine.
 We get two main advantages from this:
 1. Serializability is maintained
 2. Deadlocks are easy to find
 Keep in mind however, deadlocks still happen with this protocol.
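 The growing/shrinking rule is easy to check mechanically. A small sketch, assuming one transaction's actions are given as a list of `"lock"` and `"free"` strings like in the examples above (the `is_two_phase` name is made up):

```python
def is_two_phase(actions):
    """True if no "lock" appears after the first "free"."""
    shrinking = False
    for action in actions:
        if action == "free":
            shrinking = True   # the growing phase is over for good
        elif action == "lock" and shrinking:
            return False       # tried to grow again after shrinking
    return True
```

 This accepts `["lock", "lock", "lock", "free", "free", "free"]` and rejects `["lock", "lock", "free", "lock", "free", "free"]`.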
--- a/cst363/lec/lec18.md
+++ b/cst363/lec/lec18.md
@ -0,0 +1,22 @@
 # lec18
 Using graphs & trees to avoid deadlocks.
 ## Trees
 Suppose we have a tree filled with data that we want to access.
 On our first access into the tree we may lock whichever node we want; however, every subsequent lock after that point may happen _only_ if the parent of the target node is currently locked by us.
 The main disadvantage to this methodology is that if we want to access the root node and a leaf node, it means we must do a lot of intermediary locking.
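 A sketch that checks one transaction's lock requests against the parent rule described above. It assumes the tree is given as a hypothetical `parent` dict (child to parent, with the root absent) and the actions as `("lock", node)` or `("unlock", node)` pairs. Only the parent rule from these notes is checked.

```python
def follows_tree_protocol(parent, actions):
    """True if every lock after the first has its parent currently held."""
    held = set()
    first = True
    for op, node in actions:
        if op == "lock":
            # parent.get(root) is None, which is never in `held`, so the
            # root can only ever be the first lock taken.
            if not first and parent.get(node) not in held:
                return False
            held.add(node)
            first = False
        else:  # "unlock"
            held.discard(node)
    return True
```

 With `parent = {"B": "A", "C": "A", "D": "B"}`, locking `B` then `D` is allowed, but locking `B` then `C` is not, because `A` (the parent of `C`) was never locked.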
 ## Snapshot Isolation
 For this strategy we're going to scrap the idea that we're going to be using locks, graphs or even trees.
 Instead, when a transaction is about to run, we take a snapshot of everything we're going to modify, then work from there.
 When the transaction goes to commit, we check whether anything else has changed the data we're trying to write to since our snapshot was taken.
 If nothing comes up we commit with no issue.
 If something does come up we abort and restart the transaction with a new snapshot, _this time with the new stuff_.
 This time around we should be ok to commit.
 The overhead hits hard if we keep having to abort and retry transactions, but if we don't find ourselves doing that too much then it beats graphs and trees, since there's barely anything to maintain.
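 A toy version of the snapshot-then-check idea. The `Store` and `Txn` classes are invented for this sketch: each key carries a version counter, a transaction remembers the versions it saw when it started, and commit refuses to write if any key it wants to modify has moved on since then. For simplicity the snapshot copies the whole store rather than only the items to be modified.

```python
class Store:
    def __init__(self, data):
        self.data = dict(data)
        self.version = {key: 0 for key in data}  # bumped on every commit

class Txn:
    def __init__(self, store):
        self.store = store
        self.snapshot = dict(store.data)   # private view of the data
        self.seen = dict(store.version)    # versions at snapshot time
        self.writes = {}                   # buffered changes

    def read(self, key):
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        """Apply buffered writes, or return False if someone got there first."""
        for key in self.writes:
            if self.store.version[key] != self.seen[key]:
                return False  # abort: the caller retries with a new snapshot
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.version[key] += 1
        return True
```

 If two transactions both decrement the same `rooms` counter from one snapshot, the first commit succeeds and the second returns `False`; retrying with a fresh `Txn` picks up the new value and commits cleanly.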