db: lec1-7 rdy for master

shockrahwow 2018-10-03 13:13:17 -07:00
parent ff69cfae11
commit 1989680cb1
6 changed files with 79 additions and 64 deletions


@ -1,25 +1,35 @@
# lec1 # lec1
## A few reasons to have them \ ## Databases introduction
And what they *require* \
Database systems generally need support for:
1. querying - \
* Finding things \
* Just as well structured data makes querying easier
2. access control - \ First off, why do we even need a database, and what does it accomplish?
* who can access which data segments and what they can do with that data \
* reading, writing, sending, etc
3. corruption prevention - \ Generally a database will have 3 core elements to it:
* mirroring/raid/parity checking/checksums/etc as some examples
1. querying
* Finding things
* Just as well structured data makes querying easier
2. access control
* who can access which data segments and what they can do with that data
* reading, writing, sending, etc
3. corruption prevention
* mirroring/raid/parity checking/checksums/etc as some examples
## Modeling Data
## Modeling Data \
Just like other data problems we can choose what model we use to deal with data. Just like other data problems we can choose what model we use to deal with data.
In the case for sqlite3 the main data model we have are tables, where we store our pertinent data, and later we'll learn even data about our data is stored in tables.
__Schema__ is the deisgn or structure of a specific database. While the __instance__ is the occurance of that schema with some data inside the fields. _The data inside those fields at this point don't really matter. \ Because everything goes into a table, it means we also have to have a plan for _how_ we want to lay out our data in the table.
The __schema__ is that design/structure for our database.
The __instance__ is the occurrence of that schema with some data inside the fields, i.e. we have a table sitting somewhere in the database which follows the structure given by the aforementioned schema.
__Queries__ are typically known to be declarative; typically we don't care about what goes on behind the scenes in practice since by this point we are assuming we have tools we trust and know to be somewhat efficient. \ __Queries__ are typically known to be declarative; typically we don't care about what goes on behind the scenes in practice since by this point we are assuming we have tools we trust and know to be somewhat efficient.
__Transactions__ are a set of operations. Transactions are not alllowed to fail. If _anything_ fails then everything should be undone and the state should revert to previous state. Finally we have __transactions__, which are a set of operations designed to commit only if all of them complete successfully.
Transactions are not allowed to partially fail.
If _anything_ fails then everything should be undone and the state should revert to the previous state.
This is useful because if we are, for example, transferring money to another account, we want to make sure that the exchange happens completely; otherwise we should back out of the operation altogether.
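The money transfer idea can be sketched with Python's built-in `sqlite3` module. This is only a minimal sketch: the `accounts` table, its `check` constraint, and the `transfer` helper are made up for illustration and are not part of the lecture material.

```python
import sqlite3

# Hypothetical schema: a balance is never allowed to go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table accounts ("
    "name text primary key, "
    "balance integer check (balance >= 0))"
)
conn.execute("insert into accounts values ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "update accounts set balance = balance + ? where name = ?",
                (amount, dst),
            )
            conn.execute(
                "update accounts set balance = balance - ? where name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit broke the check constraint, so the credit was undone too.
        return False


ok = transfer(conn, "alice", "bob", 30)        # both updates succeed
failed = transfer(conn, "alice", "bob", 500)   # debit fails, credit is rolled back
balances = dict(conn.execute("select name, balance from accounts"))
```

The credit is deliberately done before the debit so that the failing case has a half-finished change to undo; after the rollback the balances look as if the second transfer never started.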


@ -2,61 +2,64 @@
Covering `tables, tuples, data stuff` Covering `tables, tuples, data stuff`
## Problem Statement \ ## Problem Statement
We need to be able to manipulate data easily We need to be able to manipulate data easily
> IMS had been using trees for a while a long time ago For some time, previous database systems like IMS had been using tree structures, but there was still a demand for a system that _anyone_ could use.
This issue is what brings us closer to the table structure that we have mentioned in previous lectures.
> Rows --> __atributes__ To actually guide _how_ we use tables we'll use the following logic:
> Columns --> __tuple__ * Rows --> Contain whole entries or records
* all the data in a row is meant to go together
* Columns --> Individually they are attributes or fields
* Each column is guaranteed to have _only_ 1 type of data in it(e.g. name, title, balance, id\_number)
somtimes we refer to the title of the columns to be fields * Table --> __relation__
> Table --> __relation__ Relational instance as well for another term
relational instance as well for another term * Domain
* The set of values allowed in a field
> Domain
The set of values allowed in a field
## NULL ## NULL
cant operate on it at all `NULL` is special, especially in sqlite3 because we aren't allowed to perform operations with it at all.
If we tried to do for example `NULL < 3` then we would just get back NULL; that way we avoid non-deterministic behavior and we are able to parse out bad results later on.
There are a few exceptions, however, where NULL values are accounted for.
> Count * Count
* We only count if there is a row there or not, the data inside does not matter in this context.
The only thing that lets you operate on NULL. Even then you only get 0 back.
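To see the NULL behavior for yourself, here is a small sketch using Python's built-in `sqlite3` module. The table `t` and its column `x` are made-up names for this example. Note the assumption being illustrated: `count(*)` counts rows regardless of their contents, while `count(x)` skips rows where `x` is NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (x integer)")
# Python's None is stored as NULL.
conn.executemany("insert into t values (?)", [(1,), (None,), (5,)])

# Comparing against NULL gives NULL back (None in Python), not true or false.
null_compare = conn.execute("select NULL < 3").fetchone()[0]

# count(*) only cares whether a row exists.
count_rows = conn.execute("select count(*) from t").fetchone()[0]

# count(x) ignores the row where x is NULL.
count_x = conn.execute("select count(x) from t").fetchone()[0]
```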
## Keys Types ## Keys Types
> Super Key * Super Key
* Candidate Key
* Primary Key
> Candidate Key ### Problem Statement
> Primary Key The rows are not distinguishable from each other; we still have a mess of data sitting there unlabeled. Some kind of identifier is necessary to be able to access every tuple in the relational set.
## Problem Statement
The rows are not distinguishable from each other; we still have a mes of data sitting there unlabeled. Some kind of identifier is necessary to be able to access every tuple in the relational set.
### SuperKey ### SuperKey
Set of attr is a superkey for a table as long as that combination of fields remains unique for every tuple in the relational set. A set of attributes is a __superkey__ for a table as long as that combination of fields remains unique for every tuple in the relational set.
In other words if we have multiple fields; f1 f3 f5 might be a good combo to use as a key into the table. In other words if we have multiple fields; f1 f3 f5 might be a good combo to use as a key into the table, because it might be able to identify a unique entry in our table.
> What's a valid superkey? * What's a valid superkey?
For starters anything that contains another valid superkey For starters anything that contains another valid superkey
Any subset of a full tuple that can uniquely identify any row in the table. Any subset of a full tuple that can uniquely identify any row in the table.
> Can a whole row be a superkey? * Can a whole row be a superkey?
...full on brainlet.........yes As long as it can identify any unique row in a table then it _is_ a superkey for that table.
### Candidate Key ### Candidate Key
Any superkey that wouldn't be a superkey if one of the attributes were removed. Say then that we have a superkey that takes columns {1,3,5,6,7}, but removing any one of those columns no longer reliably identifies an arbitrary _unique_ row. Any superkey that wouldn't be a superkey if one of the attributes were removed. Say then that we have a superkey that takes columns {1,3,5,6,7}, but removing any one of those columns no longer reliably identifies an arbitrary _unique_ row.
To put it simply it is the most minimal superkey; though this doesn't entail that there can't be multiple candidate keys for a given table.
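The superkey and candidate key definitions can be checked mechanically against one set of rows. The sketch below is plain Python with made-up sample data, and the helper names `is_superkey` and `is_candidate_key` are hypothetical. Keep in mind it only tests a single instance; a real key has to hold for every instance the schema allows.

```python
from itertools import combinations


def is_superkey(rows, cols):
    """True if the chosen columns give a distinct value combination per row."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in cols)
        if key in seen:
            return False
        seen.add(key)
    return True


def is_candidate_key(rows, cols):
    """True if cols is a superkey and dropping any one column breaks it."""
    if not is_superkey(rows, cols):
        return False
    # Checking the subsets that are one column smaller is enough: if none of
    # them is a superkey, no smaller subset can be one either.
    return not any(
        is_superkey(rows, smaller)
        for smaller in combinations(cols, len(cols) - 1)
    )


# Made-up instance for illustration.
rows = [
    {"id": 1, "name": "ann", "dept": "cs"},
    {"id": 2, "name": "ann", "dept": "math"},
    {"id": 3, "name": "bob", "dept": "cs"},
]

id_is_superkey = is_superkey(rows, ("id",))
name_is_superkey = is_superkey(rows, ("name",))
name_dept_is_candidate = is_candidate_key(rows, ("name", "dept"))
id_name_is_candidate = is_candidate_key(rows, ("id", "name"))
```

Here `("id", "name")` is a superkey but not a candidate key, because `("id",)` alone already identifies every row.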
### Primary key ### Primary key
@ -64,5 +67,6 @@ Any candidate key the database designer has chosen to serve as the unique
### Foreign Key ### Foreign Key
Set of attrs in one table that are the primary key attrs of another table. More info about this key type will come later but just know for now that it exists in the wild. If a table/relation includes among its attributes the primary key for another relation then that set of attributes is referred to as a foreign key because that key references another relation.
The table being referred to is identified as the referenced relation.
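A foreign key can be tried out with Python's built-in `sqlite3` module. The `departments` and `employees` tables below are made up for this sketch. One assumption worth flagging: sqlite3 normally leaves foreign key enforcement off unless `pragma foreign_keys = on` is run on the connection, so the sketch turns it on first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Enforcement is off by default in sqlite3, so switch it on for this connection.
conn.execute("pragma foreign_keys = on")

# departments is the referenced relation; employees holds the foreign key.
conn.execute("create table departments (dept_id integer primary key, title text)")
conn.execute(
    "create table employees ("
    "emp_id integer primary key, "
    "name text, "
    "dept_id integer references departments(dept_id))"
)

conn.execute("insert into departments values (1, 'math')")
conn.execute("insert into employees values (10, 'ann', 1)")  # dept 1 exists

try:
    # dept 99 does not exist in departments, so this row should be refused.
    conn.execute("insert into employees values (11, 'bob', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

employee_count = conn.execute("select count(*) from employees").fetchone()[0]
```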


@ -1,15 +1,15 @@
# lec4 # lec4
This section mostly relies on practicing some of the most basic commands for sqlite3, for that reason most of the content is expressed through practice in the lab sub-section.
## Lab* ## Lab*
This lecture has some lab questions in the `lab/` dircory named `table1.pdf` *and* some example data called `patients.sql`. This lecture has some lab questions in the `lab/` directory named `table1.pdf` *and* some example data called `patients.sql`.
`table1.pdf` will have some exercises to learn the basic commands of sqlite and `patients.sql` should have some example data which _table1_ asks you to query. `table1.pdf` will have some exercises to learn the basic commands of sqlite3 and `patients.sql` should have some example data which _table1_ asks you to query.
## Serverless ## Serverless
Instead of having listen server listen for requests to perform actions upon these requests we simply have some databse held on a machina and we perform all of our sql commands on that machine. Instead of having a server listen for requests and perform actions upon those requests, we simply have some database held on our own machine and we perform all of our sql commands on that machine.
For now we'll be dealing with small test databases so that we can practice the commands and observe each one's behavior; this will give you a good feeling of what does what in sqlite3.
For now we'll be dealing with small test db's so that we can practice the commands and observe each one's behavior; this will give you a good feeling of what does what in sqlite3.


@ -7,33 +7,35 @@ This lecture will have a lab activity in `cst366/lab/1994-census-summary.sql` wi
## Distinct Values ## Distinct Values
> Mininum - min(field) * Minimum - min(field)
Finds the smallest value in the given field Finds the smallest value in the given field
> Maximum - max(field) * Maximum - max(field)
Find the largest value in th given field Find the largest value in the given field
Say we have a column where we know there are duplicate values but we want to know what the distinct values in the column may be. Say we have a column where we know there are duplicate values but we want to know what the distinct values in the column may be.
SQLite3 has a keyword for that: `select distinct field, ... from table;` SQLite3 has a keyword for that: `select distinct field, ... from table;`
> select substr(field, startIndex, length) ... * select substr(field, startIndex, length) ...
_Note_: the start index starts counting at `1` so keep in mind we are offset `+1`. _Note_: the start index starts counting at `1` so keep in mind we are offset `+1` compared to other languages like C.
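The functions in this section can be tried on a tiny table using Python's built-in `sqlite3` module. The `people` table and its contents are made up for this sketch and are not the census data from the lab.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table people (name text, age integer)")
conn.executemany(
    "insert into people values (?, ?)",
    [("alice", 30), ("bob", 25), ("carol", 30)],
)

# min() and max() each collapse the column down to a single value.
youngest, oldest = conn.execute(
    "select min(age), max(age) from people"
).fetchone()

# distinct removes the duplicate 30.
distinct_ages = [
    row[0] for row in conn.execute("select distinct age from people order by age")
]

# substr starts counting at 1, so (name, 1, 3) is the first three characters.
prefixes = [
    row[0]
    for row in conn.execute("select substr(name, 1, 3) from people order by name")
]
```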
## Joins ## Joins
Now we want to join two tables together to associate their respective data. Now we want to join two tables together to associate their respective data.
To accomplish this we can perform a simple `join` to combine tables.
Important to note that a simple join does not necessarily take care of duplicate fields. Important to note that a simple join does not necessarily take care of duplicate fields.
If we have duplicate fields we must denote them as `target.field`. If we have duplicate fields we must denote them as `target.field`.
Here target is the table with the desired table and field is the desired field. Here `target` is the table containing the desired field and `field` is the desired field.
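Here is a rough sketch of the `target.field` notation using Python's built-in `sqlite3` module. The `students` and `grades` tables are made up for illustration; both have an `id` column, which is exactly the duplicate-field situation described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table students (id integer, name text)")
conn.execute("create table grades (id integer, grade text)")
conn.executemany("insert into students values (?, ?)", [(1, "ann"), (2, "bob")])
conn.executemany("insert into grades values (?, ?)", [(1, "A"), (2, "B")])

# Both tables have an id column, so we say which one we mean: students.id
rows = conn.execute(
    "select students.id, name, grade "
    "from students join grades on students.id = grades.id "
    "order by students.id"
).fetchall()

# Leaving off the table name makes the column ambiguous and sqlite3 refuses it.
try:
    conn.execute("select id from students join grades on students.id = grades.id")
    ambiguous = False
except sqlite3.OperationalError:
    ambiguous = True
```

`name` and `grade` need no prefix because each exists in only one of the two tables.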
## Type Casting ## Type Casting
If we have say `56` we can use a cast to turn it into an integer. If we have say `"56"` we can use a cast to turn it into an integer.
> cast(targetString as integer) > cast(targetString as integer)
This will not raise an error if a non number character is given as input to the cast function: sqlite3 converts the longest leading numeric prefix of `targetString` and returns `0` when there is none. This will not raise an error if a non number character is given as input to the cast function: sqlite3 converts the longest leading numeric prefix of `targetString` and returns `0` when there is none.
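A quick way to see what `cast` does with messy input is to just run it. The sketch below uses Python's built-in `sqlite3` module with made-up input strings. As far as I know, stock sqlite3 takes the leading digits of the string and falls back to `0` when there are none, rather than raising.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def cast_to_int(text):
    """Run cast(? as integer) in sqlite3 and return whatever comes back."""
    return conn.execute("select cast(? as integer)", (text,)).fetchone()[0]


clean = cast_to_int("56")        # a string that is entirely digits
trailing = cast_to_int("56abc")  # digits followed by other characters
no_digits = cast_to_int("abc")   # no leading digits at all
```

Because bad input quietly turns into `0`, it is worth checking the data before trusting the result of a cast.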


@ -4,7 +4,7 @@
This lecture features a lab activity in the lab/ directory named: `courses-ddl.sql` with instructions in `simple-joins-lab.pdf`. This lecture features a lab activity in the lab/ directory named: `courses-ddl.sql` with instructions in `simple-joins-lab.pdf`.
* Note: Just make sure to read int courses-ddl.sql first then courses-small.sql second otherwise there will be random errors.(I'm not taking responsibility for that garbage so don't flame me) * Note: Just make sure to read in courses-ddl.sql _first_ then courses-small.sql _second_ otherwise there will be random errors.(I'm not taking responsibility for that garbage so don't flame me)
## Natural Joins ## Natural Joins
@ -15,4 +15,5 @@ Form:
``` ```
select columns_[...] from tableLeft natural join tableRight select columns_[...] from tableLeft natural join tableRight
``` ```
While ther is no need to write extra `where` statements there is also the issue where there may be accidental matches since attributes are dropped. While there is no need to write extra `where` statements there is also the issue where there may be accidental matches since attributes are dropped.
This implies that if two tables have attributes with the same field name, then only one will be returned in the resulting table.
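Both effects, the dropped duplicate column and the accidental match, can be seen in a small sketch using Python's built-in `sqlite3` module. The `students` and `teachers` tables are made up for illustration; they share a `name` column even though a student name and a teacher name have nothing to do with each other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table students (sid integer, name text)")
conn.execute("create table teachers (tid integer, name text)")
conn.execute("insert into students values (1, 'jordan'), (2, 'ann')")
conn.execute("insert into teachers values (7, 'jordan')")

# natural join matches on every column name the two tables share: here, name.
cur = conn.execute("select * from students natural join teachers")

# The shared column shows up only once in the result.
columns = [d[0] for d in cur.description]

# The student and the teacher are paired up just because they share a name.
rows = cur.fetchall()
```

Nothing in the query says the match on `name` was intended, which is why a natural join is risky on tables you did not design yourself.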


@ -6,14 +6,12 @@ This lecture has two correspondnig lab activities in `lab/` using `1994-census-s
## Null Operations ## Null Operations
take the following table as a trivial example of working data Take the following table as a trivial example of working data
``` | a | b |
a | b |---|---|
----- | 1 | 2 |
1 | 2 | 3 | N |
3 | N
```
Where `a` and `b` are attributes and N signifies a NULL value. Where `a` and `b` are attributes and N signifies a NULL value.
If we `select a+b from table` we only get back 2 rows like normal but the second row is left empty since we are operating with a NULL value. If we `select a+b from table` we only get back 2 rows like normal but the second row is left empty since we are operating with a NULL value.
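The example table can be rebuilt with Python's built-in `sqlite3` module to check this. The table name `t` is made up for this sketch, and Python's `None` stands in for NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (a integer, b integer)")
# Same data as the example table; None is stored as NULL.
conn.executemany("insert into t values (?, ?)", [(1, 2), (3, None)])

# Two rows come back, but adding anything to NULL gives NULL.
sums = [row[0] for row in conn.execute("select a+b from t order by a")]
```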