From 1989680cb17e8db12b8250514c45c353d1ffb748 Mon Sep 17 00:00:00 2001
From: shockrahwow
Date: Wed, 3 Oct 2018 13:13:17 -0700
Subject: [PATCH] db: lec1-7 rdy for master

---
 cst363/lec/lec1.md | 40 +++++++++++++++++++------------
 cst363/lec/lec2.md | 60 ++++++++++++++++++++++++----------------------
 cst363/lec/lec4.md | 10 ++++----
 cst363/lec/lec5.md | 16 +++++++------
 cst363/lec/lec6.md |  5 ++--
 cst363/lec/lec7.md | 12 ++++------
 6 files changed, 79 insertions(+), 64 deletions(-)

diff --git a/cst363/lec/lec1.md b/cst363/lec/lec1.md
index b610c11..7f53d79 100644
--- a/cst363/lec/lec1.md
+++ b/cst363/lec/lec1.md
@@ -1,25 +1,35 @@
 # lec1
-## A few reasons to have them \
-And what they *require* \
-Database systems generally need support for:
-1. querying - \ * Finding things \ * Just as well structured data makes querying easier
+## Databases introduction

-2. access control - \ * who can access which data segments and what they can do with that data \ * reading, writing, sending, etc
+First off why do we even need a database and what do they accomplish?

-3. corruption prevention - \ * mirroring/raid/parity checking/checksums/etc as some examples
+Generally a databse will have 3 core elements to it:
+
+1. querying
+ * Finding things
+ * Just as well structured data makes querying easier
+
+2. access control
+ * who can access which data segments and what they can do with that data
+ * reading, writing, sending, etc
+
+3. corruption prevention
+ * mirroring/raid/parity checking/checksums/etc as some examples
+
+## Modeling Data

-## Modeling Data \
 Just like other data problems we can choose what model we use to deal with data.
+In the case for sqlite3 the main data model we have are tables, where we store our pertinent data, and later we'll learn even data about our data is stored in tables.

-__Schema__ is the deisgn or structure of a specific database. While the __instance__ is the occurance of that schema with some data inside the fields.
_The data inside those fields at this point don't really matter. \
+Because everything goes into a table, it means we also have to have a plan for _how_ we want to lay out our data in the table.
+The __schema__ is that design/structure for our databse.
+The __instance__ is the occurance of that schema with some data inside the fields, i.e. we have a table sitting somewhere in the databse which follows the given structure of a aforemention schema.

-__Queries__ are typically known to be declarative; typically we don't care about what goes on behind the scenes in practice since by this point we are assuming we have tools we trust and know to be somewhat efficient. \
+__Queries__ are typically known to be declarative; typically we don't care about what goes on behind the scenes in practice since by this point we are assuming we have tools we trust and know to be somewhat efficient.

-__Transactions__ are a set of operations. Transactions are not alllowed to fail. If _anything_ fails then everything should be undone and the state should revert to previous state.
+Finally we have __transactions__ which are a set of operations who are not designed to only commit if they are completed successfully.
+Transactions are not alllowed to fail.
+If _anything_ fails then everything should be undone and the state should revert to previous state.
+This is useful because if we are, for example, transferring money to another account we want to make sure that the exchange happens seamlessly otherwise we should back out of the operation altogether.
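To make the all-or-nothing behavior of transactions from the lec1 notes concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `accounts` table, its `CHECK` constraint, and the `transfer` and `balances` helpers are assumptions made up for illustration; none of them come from the notes.

```python
import sqlite3

# Hypothetical schema for illustration; the notes do not define an accounts table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; undo both updates if either one fails."""
    try:
        # Used as a context manager, the connection commits when the block
        # finishes normally and rolls back if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft, so the credit made
        # by the first UPDATE has been rolled back as well.
        return False


def balances(conn):
    """Return (id, balance) pairs ordered by id."""
    return conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()
```

In this sketch the credit is deliberately applied before the debit, so a failed debit shows that the earlier, already-executed update is reverted too rather than left half-applied.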
diff --git a/cst363/lec/lec2.md b/cst363/lec/lec2.md
index 9a96ebe..23f60c6 100644
--- a/cst363/lec/lec2.md
+++ b/cst363/lec/lec2.md
@@ -2,61 +2,64 @@
 Covering `tables, tuples, data stuff`
-## Problem Statement \
+## Problem Statement
+
 We need to be able to manipulate data easily
-> IMS had been using trees for a while a long time ago
+For sometime previous databses systems, like IMS, had been using tree structures for a while but, there was still a demand for a system that _anyone_ could use.
+This issue is what brings us closer to the table structure that we have mentioned in previous lectures.

-> Rows --> __atributes__
+To actually guide _how_ we use tables we'll use the followig logic:

-> Columns --> __tuple__
+* Rows --> Contain whole entries or records
+ * all the data in a row is meant to go together
+
+* Columns --> Individually they are attributes or fields
+ * Each column is guaranteed to have _only_ 1 type of data in it(e.g. name, title, balance, id\_number)

-somtimes we refer to the title of the columns to be fields
+* Table --> __relation__

-> Table --> __relation__
+Relational instance as well for another term

-relational instance as well for another term
-
-> Domain
-
-The set of values allowed in a field
+* Domain
+ * The set of values allowed in a field
 ## NULL
-cant operate on it at all
+`NULL` is special, especially in sqlite3 because we aren't allowed to perform operations with it at all.
+If we tried to do for example `NULL < 3` then we would just get back NULL; that way we avoid non-deterministic behavior and we are able to parse out bad results later on.
+There are a few exceptions to NULL however, where they will be accounted for.

-> Count
-
-The only thing that lets you operate on NULL. Even then you only get 0 back.
+* Count
+ * We only count if there is a row there or not, the data inside does not matter in this context.
 ## Keys Types
-> Super Key
+* Super Key
+* Candidate Key
+* Primary Key

-> Candidate Key
+### Problem Statement

-> Primary Key
-
-## Problem Statement
-
-The rows are not distinguishable from each other; we still have a mes of data sitting there unlabeled. Some kind of identifier is necessary to be able to access every tuple in the relational set.
+The rows are not distinguishable from each other; we still have a mess of data sitting there unlabeled. Some kind of identifier is necessary to be able to access every tuple in the relational set.
 ### SuperKey
-Set of attr is a superkey for a table as long as that combination of fields remains unique for every tuple in the relational set.
-In other words if we have multiple fields; f1 f3 f5 might be a good combo to use as a key into the table.
+A set of attributes is a __superkey__ for a table as long as that combination of fields remains unique for every tuple in the relational set.
+In other words if we have multiple fields; f1 f3 f5 might be a good combo to use as a key into the table, because it might be able to identify a unique entry in our table.

-> What's a valid superkey?
+* What's a valid superkey?
 For starters anything that contains another valid superkey
 Any subset of a full tuple that can uniquely identify any row in the table.
-> Can a whole row be a superkey?
-...full on brainlet.........yes
+* Can a whole row be a superkey?
+As long as it can identify any unique row in a table then it _is_ a superkey for that table.
 ### Candidate Key
 Any super key that wouldn't be a superkey if one of the attr were removed. Say then that we have a super key that takes columns {1,3,5,6,7}, but removing anyone of the rows no longer reliably returns an arbitrary _unique_ row.
+To put it simply it is the most minimal superkey; though this doesn't entail that there can't be multiple candidate keys for a given table.
 ### Primary key
@@ -64,5 +67,6 @@
 Any candidate key the database designer has chosen to serve as the unique
 ### Foreign Key
-Set of attrs in one table that are the primary key attrs of another table. More info about this key type will come later but just know for now that it exists in the wild.
+If a table/relation includes among it's attributes the primary key for another relation then it is referred to as a foreign key because that key references another relation.
+The table being refferred to is identified as a referenced relation.
diff --git a/cst363/lec/lec4.md b/cst363/lec/lec4.md
index fbbee57..922fcd1 100644
--- a/cst363/lec/lec4.md
+++ b/cst363/lec/lec4.md
@@ -1,15 +1,15 @@
 # lec4
+This section mostly relies on practicing some of the most basic commands for sqlite3, for that reason most of the content is expressed through practice in the lab sub-section.
 ## Lab*
-This lecture has some lab questions in the `lab/` dircory named `table1.pdf` *and* some example data called `patients.sql`.
-`table1.pdf` will have some exercises to learn the basic commands of sqlite and `patients.sql` should have some example data which _table1_ asks you to query.
+This lecture has some lab questions in the `lab/` directory named `table1.pdf` *and* some example data called `patients.sql`.
+`table1.pdf` will have some exercises to learn the basic commands of sqlite3 and `patients.sql` should have some example data which _table1_ asks you to query.
 ## Serverless
-Instead of having listen server listen for requests to perform actions upon these requests we simply have some databse held on a machina and we perform all of our sql commands on that machine.
-
-For now we'll be dealing with small test db's so that we can practice the commands and observe each one's behavior; this will give you a good feeling of what does what in sqlite3.
+Instead of having listen server listen for requests to perform actions upon these requests we simply have some database held on our own machine and we perform all of our sql commands on that machine.
+For now we'll be dealing with small test databases so that we can practice the commands and observe each one's behavior; this will give you a good feeling of what does what in sqlite3.
diff --git a/cst363/lec/lec5.md b/cst363/lec/lec5.md
index b843dc5..bc8802d 100644
--- a/cst363/lec/lec5.md
+++ b/cst363/lec/lec5.md
@@ -7,33 +7,35 @@ This lecture will have a lab activity in `cst366/lab/1994-census-summary.sql` wi
 ## Distinct Values
-> Mininum - min(field)
+* Mininum - min(field)
 Finds the smallest value in the given filed
-> Maximum - max(field)
+* Maximum - max(field)

-Find the largest value in th given field
+Find the largest value in the given field
 Say we have a column where we know there are duplicate values but we want to konw what the distinct values in the column may be. SQLite3 has a function for that: `select distinct field, ... from table;`
-> select substr(field, startIndex, length) ...
+* select substr(field, startIndex, length) ...

-_Note_: the start index starts counting at `1` so keep in mind we are offset `+1`.
+_Note_: the start index starts counting at `1` so keep in mind we are offset `+1` compared to other language like C.
 ## Joins
 Now we want to join to tables together to associate their respective data.
+To accomplish this we can perform a simple `join` to combine tables.
 Important to note that a simple join does not necessarily take care of duplicate fields. If we have duplicate fields we must denote them as `target.field`.
-Here target is the table with the desired table and field is the desired field.
+Here `target` is the table with the desired table and `field` is the desired field.
 ## Type Casting
-If we have say `56` we can use a cast to turn it into an integer.
+If we have say `"56"` we can use a cast to turn it into an integer.
 > cast(targetString as integer)
 This will return with an error if a non number character is given as input to the cast function, here in this example we denote it with `targetString`.
+
diff --git a/cst363/lec/lec6.md b/cst363/lec/lec6.md
index e998cf9..4a6cf8b 100644
--- a/cst363/lec/lec6.md
+++ b/cst363/lec/lec6.md
@@ -4,7 +4,7 @@
 This lecture features a lab activity in the lab/ directory named: `courses-ddl.sql` with instructions in `simple-joins-lab.pdf`.
-* Note: Just make sure to read int courses-ddl.sql first then courses-small.sql second otherwise there will be random errors.(I'm not taking responsibility for that garbage so don't flame me)
+* Note: Just make sure to read in courses-ddl.sql _first_ then courses-small.sql _second_ otherwise there will be random errors.(I'm not taking responsibility for that garbage so don't flame me)
 ## Natural Joins
@@ -15,4 +15,5 @@ Form:
 ```
 select columns_[...] from tableLeft natural join tableRight
 ```
-While ther is no need to write extra `where` statements there is also the issue where there may be accidental matches since attributes are dropped.
+While there is no need to write extra `where` statements there is also the issue where there may be accidental matches since attributes are dropped.
+This implies that if two tables have attributes with the same field name, then only one will be returned in the resulting table.
diff --git a/cst363/lec/lec7.md b/cst363/lec/lec7.md
index fe2b6a8..d9e8f46 100644
--- a/cst363/lec/lec7.md
+++ b/cst363/lec/lec7.md
@@ -6,14 +6,12 @@ This lecture has two correspondnig lab activities in `lab/` using `1994-census-s
 ## Null Operations
-take the following table as a trivial example of working data
+Take the following table as a trivial example of working data

-```
-a | b
------
-1 | 2
-3 | N
-```
+| a | b |
+|---|---|
+| 1 | 2 |
+| 3 | N |
 Where `a` and `b` are attributes and N signifiies a NULL value.
If we `select a+b from table` we only get back 2 rows like normal but the second row is left empty since we are operating with a NULL value.
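
The NULL behavior described in the lec7 hunk (and the `NULL < 3` and count remarks in lec2) can be checked with a small sketch using Python's built-in `sqlite3` module. The table name `t` is an assumption, since the notes only call it `table`, which is a reserved word in SQL. The `count(b)` column is an addition that the notes do not mention.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name t, holding the same two rows as the example table.
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 2), (3, None)])

# Arithmetic with NULL yields NULL, which Python's sqlite3 reports as None.
sums = conn.execute("SELECT a + b FROM t ORDER BY a").fetchall()

# Comparing against NULL also yields NULL rather than true or false.
comparison = conn.execute("SELECT NULL < 3").fetchone()

# count(*) counts rows regardless of their contents;
# count(b) skips rows where b is NULL.
counts = conn.execute("SELECT count(*), count(b) FROM t").fetchone()

print(sums)        # [(3,), (None,)]
print(comparison)  # (None,)
print(counts)      # (2, 1)
```

Both rows still come back from `a + b`, matching the note above, with the second row empty (`None` on the Python side).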