db: lec 1-10 caught up

This commit is contained in:
shockrahwow 2018-10-03 13:19:05 -07:00
parent e6759cac92
commit b1a45702f4
15 changed files with 276 additions and 57 deletions

cst363/lab/triggers-lab.pdf Normal file
cst363/lab/views-lab.pdf Normal file

# lec1
## Databases introduction
First off, why do we even need a database and what does one accomplish?
Generally a database will have 3 core elements to it:
1. querying
    * Finding things
    * Well-structured data also makes querying easier
2. access control
    * who can access which data segments and what they can do with that data
    * reading, writing, sending, etc.
3. corruption prevention
    * mirroring/RAID/parity checking/checksums/etc. as some examples
## Modeling Data
Just like other data problems we can choose what model we use to deal with data.
In the case for sqlite3 the main data model we have are tables, where we store our pertinent data, and later we'll learn even data about our data is stored in tables.
Because everything goes into a table, it means we also have to have a plan for _how_ we want to lay out our data in the table.
The __schema__ is that design/structure for our database.
The __instance__ is the occurrence of that schema with some data inside the fields, i.e. we have a table sitting somewhere in the database which follows the given structure of the aforementioned schema.
__Queries__ are typically declarative; in practice we don't care about what goes on behind the scenes, since by this point we assume we have tools we trust and know to be reasonably efficient.
Finally we have __transactions__, which are sets of operations designed to only commit if every operation completes successfully.
Transactions are not allowed to partially fail: if _anything_ fails then everything should be undone and the state should revert to the previous state.
This is useful because if we are, for example, transferring money to another account we want to make sure that the exchange happens seamlessly otherwise we should back out of the operation altogether.

cst363/lec/lec10.md Normal file
# lec10
This lecture has a corresponding lab exercise whose instructions can be found in `triggers-lab.pdf`.
## What is a trigger
Something that executes when _some operation_ is performed
## Structure
```
create trigger NAME before some_operation on some_table
when (condition)
begin
    do_something;
end;
```
To explain: First we `create trigger` followed by some trigger name.
Then we have to denote that this trigger should fire whenever some operation happens.
This trigger then executes everything in the `begin...end;` section _before_ the new operation happens.
> `after`
Likewise if we want to fire a trigger _after_ some operation we can just replace the `before` keyword with `after`.
> `new.adsf`
Refers to the _new_ value being added to a table.
> `old.adsf`
Refers to the _old_ value being changed in a table.
## Trigger Metadata
If you want to look at what triggers exist you can query the `sqlite_master` table.
```
select * from sqlite_master where type='trigger';
```
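To make this concrete, here's a minimal sketch using Python's stdlib `sqlite3` module; the table names (`accounts`, `audit_log`) and trigger name are made up for illustration. It fires an `after insert` trigger and then reads the trigger metadata back out of `sqlite_master`.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table accounts(name text, balance int);
create table audit_log(msg text);

-- fire after every insert on accounts; new.* refers to the row being added
create trigger log_insert after insert on accounts
when new.balance >= 0
begin
    insert into audit_log values('added ' || new.name);
end;
""")

con.execute("insert into accounts values('alice', 100)")
log = con.execute("select msg from audit_log").fetchone()[0]
triggers = [r[0] for r in con.execute(
    "select name from sqlite_master where type='trigger'")]
```

Querying `sqlite_master` with `type='trigger'` lists every trigger, here just `log_insert`.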

Covering `tables, tuples, data stuff`
## Problem Statement
We need to be able to manipulate data easily
For some time, older database systems like IMS had been using tree structures, but there was still a demand for a system that _anyone_ could use.
This issue is what brings us closer to the table structure that we have mentioned in previous lectures.
To actually guide _how_ we use tables we'll use the following logic:
* Rows --> Contain whole entries or records, also called __tuples__
    * all the data in a row is meant to go together
* Columns --> Individually they are __attributes__ or fields
    * Each column is guaranteed to have _only_ 1 type of data in it (e.g. name, title, balance, id\_number)
    * sometimes we refer to the title of a column as a field
* Table --> __relation__
    * relational instance is another term for this
* Domain
    * The set of values allowed in a field
## NULL
`NULL` is special, especially in sqlite3 because we aren't allowed to perform operations with it at all.
If we tried to do for example `NULL < 3` then we would just get back NULL; that way we avoid non-deterministic behavior and we are able to parse out bad results later on.
There are a few exceptions to NULL however, where they will be accounted for.
* Count
    * `count` is the one aggregate that tolerates NULL: it only checks whether a row is there or not, the data inside does not matter in this context.
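A quick sketch of these NULL rules using Python's stdlib `sqlite3` module (the tiny table here is made up): a comparison with NULL comes back as NULL (Python's `None`), while `count(*)` only cares that the rows exist.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table t(a int)")
con.execute("insert into t values (1), (NULL)")

# NULL < 3 is neither true nor false -- it stays NULL (None in Python)
null_cmp = con.execute("select NULL < 3").fetchone()[0]

# count(*) counts rows regardless of contents; count(a) skips the NULL
rows = con.execute("select count(*), count(a) from t").fetchone()
```

Here `rows` is `(2, 1)`: two rows exist, but only one has a non-NULL value in `a`.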
## Key Types
* Super Key
* Candidate Key
* Primary Key
### Problem Statement
The rows are not distinguishable from each other; we still have a mess of data sitting there unlabeled. Some kind of identifier is necessary to be able to access every tuple in the relational set.
### SuperKey
A set of attributes is a __superkey__ for a table as long as that combination of fields remains unique for every tuple in the relational set.
In other words, if we have multiple fields, then f1, f3, f5 might be a good combination to use as a key into the table, because together they may identify a unique entry.
* What's a valid superkey?
    * Any subset of a tuple's attributes that can uniquely identify any row in the table; for starters, anything that contains another valid superkey.
* Can a whole row be a superkey?
    * Yes; as long as it can identify any unique row in a table then it _is_ a superkey for that table.
### Candidate Key
Any superkey that would no longer be a superkey if one of its attributes were removed. Say we have a superkey that takes columns {1,3,5,6,7}; removing any one of those columns means we can no longer reliably identify an arbitrary _unique_ row.
To put it simply it is the most minimal superkey; though this doesn't entail that there can't be multiple candidate keys for a given table.
### Primary key
Any candidate key the database designer has chosen to serve as the unique identifier for tuples in the table.
### Foreign Key
If a table/relation includes among its attributes the primary key of another relation, then that set of attributes is referred to as a __foreign key__, because it references another relation.
The table being referred to is identified as the referenced relation.

# lec4
This section mostly relies on practicing some of the most basic commands for sqlite3, for that reason most of the content is expressed through practice in the lab sub-section.
## Lab*
This lecture has some lab questions in the `lab/` directory named `table1.pdf` *and* some example data called `patients.sql`.
`table1.pdf` will have some exercises to learn the basic commands of sqlite3 and `patients.sql` should have some example data which _table1_ asks you to query.
## Serverless
Instead of having a server listen for requests and perform actions upon those requests, we simply have a database file held on our own machine and we perform all of our sql commands on that machine.
For now we'll be dealing with small test databases so that we can practice the commands and observe each one's behavior; this will give you a good feeling of what does what in sqlite3.

This lecture will have a lab activity in `cst366/lab/1994-census-summary.sql` with ...
## Distinct Values
* Minimum - min(field)
    * Finds the smallest value in the given field
* Maximum - max(field)
    * Finds the largest value in the given field
Say we have a column where we know there are duplicate values but we want to know what the distinct values in the column may be.
SQLite3 has a function for that: `select distinct field, ... from table;`
* select substr(field, startIndex, length) ...
    * _Note_: the start index starts counting at `1`, so keep in mind we are offset `+1` compared to other languages like C.
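The 1-indexing is easy to check with a one-liner through Python's stdlib `sqlite3` module:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# substr is 1-indexed: position 1 is the *first* character, unlike C's 0-indexing
first_two = con.execute("select substr('hello', 1, 2)").fetchone()[0]
```

`first_two` comes back as `'he'`; asking for `substr('hello', 0, 2)` would only return `'h'`, which is the classic off-by-one trap.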
## Joins
Now we want to join two tables together to associate their respective data.
To accomplish this we can perform a simple `join` to combine tables.
Important to note that a simple join does not necessarily take care of duplicate fields.
If we have duplicate fields we must denote them as `target.field`.
Here `target` is the table containing the desired field and `field` is the desired field itself.
## Type Casting
If we have say `"56"` we can use a cast to turn it into an integer.
> cast(targetString as integer)
In sqlite3 this will _not_ return an error if non-numeric characters are given as input to the cast function (here denoted `targetString`); instead the leading numeric prefix is used, and a completely non-numeric string casts to `0`.
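A quick sketch of sqlite3's casting behavior through Python's stdlib `sqlite3` module, showing the clean case alongside the messy ones:

```python
import sqlite3

con = sqlite3.connect(":memory:")
as_int = con.execute("select cast('56' as integer)").fetchone()[0]
# sqlite3 does not raise on bad input: the leading numeric prefix is used,
# and a fully non-numeric string just casts to 0
prefix = con.execute("select cast('56abc' as integer)").fetchone()[0]
junk = con.execute("select cast('abc' as integer)").fetchone()[0]
```

So `'56'` and `'56abc'` both cast to `56`, and `'abc'` quietly becomes `0` rather than an error.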

This lecture features a lab activity in the lab/ directory named: `courses-ddl.sql` with instructions in `simple-joins-lab.pdf`.
* Note: Just make sure to read in courses-ddl.sql _first_ then courses-small.sql _second_, otherwise there will be random errors. (I'm not taking responsibility for that garbage so don't flame me)
## Natural Joins
Form:
```
select columns_[...] from tableLeft natural join tableRight
```
While there is no need to write extra `where` statements, there is also the issue that there may be accidental matches since duplicate attributes are dropped.
This implies that if two tables have attributes with the same field name, then only one will be returned in the resulting table.
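Here's a minimal sketch of that dropping behavior using Python's stdlib `sqlite3` module; the `student`/`grade` tables and their columns are made up for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table student(id int, name text);
create table grade(id int, score int);
insert into student values (1, 'ana'), (2, 'bob');
insert into grade values (1, 90);
""")
# natural join matches on the shared attribute name `id`;
# the duplicate id column appears only once in the result
rows = con.execute("select * from student natural join grade").fetchall()
```

The result is `[(1, 'ana', 90)]`: only the row where `id` matches in both tables survives, and `id` shows up once even though both tables have it.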

cst363/lec/lec7.md Normal file
# lec7
## Lab Activity
This lecture has two corresponding lab activities in `lab/` using `1994-census-summary.sql`, with instructions in `aggregation-lab.pdf` and `nested-subqueries-lab.pdf`.
## Null Operations
Take the following table as a trivial example of working data
| a | b |
|---|---|
| 1 | 2 |
| 3 | N |
Where `a` and `b` are attributes and N signifies a NULL value.
If we `select a+b from table` we only get back 2 rows like normal but the second row is left empty since we are operating with a NULL value.
Even if we use multiplication or some kind of comparison against NULL, the result for that row is simply NULL, since NULL in sqlite3 doesn't mean 0.
Instead NULL in sqlite3 actually represents something that doesn't exist.
> count is the exception: count(*) still counts the row, while count(field) skips NULLs
This is the only exception to the _ignore NULL_ "rule".
## Aggregation
In this section we'll deal with functions like `count`, `avg`, `min`, and `max`.
We call these _aggregate_ functions because they aggregate multiple data points into one.
> round(value, precision)
Rounds off a floating point number to some level of precision.
> group by _attr_
This will group rows with matching attribute values together in the result of a query
> having _condition_
Similar to `where`, however the condition applies to whole groups rather than individual rows.
## Nested Subqueries
Recall that when we perform a query the result is a table.
We can leverage this and perform some query to query a resultant table to further our ability to filter results from a table.
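Both ideas can be sketched together with Python's stdlib `sqlite3` module; the `pay` table and its contents are made up for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table pay(dept text, salary int)")
con.executemany("insert into pay values (?, ?)",
                [("cs", 100), ("cs", 200), ("art", 50)])

# aggregate per group, then filter the *groups* with having
rows = con.execute("""
    select dept, avg(salary) from pay
    group by dept
    having avg(salary) > 60
""").fetchall()

# the result of a query is itself a table we can query again
top = con.execute("""
    select max(avg_sal) from
        (select avg(salary) as avg_sal from pay group by dept)
""").fetchone()[0]
```

`rows` is `[('cs', 150.0)]` (the `art` group is filtered out by `having`), and the nested subquery picks the largest per-department average, `150.0`.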

cst363/lec/lec8.md Normal file
# lec8
## Lab
The lab exercises for this lecture can be found under `lab/` as `db-mods-transactions-lab.pdf`.
DB Modifications, plus transactions
## Modifying Data
Since we're dealing with data we may need to add, delete or modify entries in some table.
When we inserted data before, we did simple insertions like in the previous lab exercises: `insert into tableName values(...);`, where the arguments are listed in the same order as the columns in the table structure.
However, we can also pass arguments by name, eliminating the need to provide them in a rigid order:
```
insert into tableName(list, of, attributes) values('respective', 'data', 'entries');
```
We can also move things from one table into another table.
```
insert into targetTable select ... from hostTable;
```
### Deleting
```
delete from tableName where ...;
```
Deletes a _whole row_.
Caution: the `where` condition can match many rows at once, so a condition that matches multiple rows (for example one built from a subquery) will delete every matching row from the target table.
### Updating entries
```
update table set attribute=123 where def='abc';
```
The above updates an attribute based on the condition `where def='abc'`.
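The insert/update/delete commands above can be exercised end-to-end with Python's stdlib `sqlite3` module; the `staff` table and its rows are made up for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table staff(id int, name text, dept text)")

# insert by name: the column list fixes the order, not the table definition
con.execute("insert into staff(name, id, dept) values ('ana', 1, 'cs')")
con.execute("insert into staff(name, id, dept) values ('bob', 2, 'art')")

con.execute("update staff set dept='math' where name='ana'")
con.execute("delete from staff where name='bob'")  # removes the whole row

rows = con.execute("select id, name, dept from staff").fetchall()
```

Only `(1, 'ana', 'math')` remains: ana's department was updated in place and bob's entire row was deleted.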
## Transactions
Set of instructions which upon failure do not modify any state.
```
begin;
-- set of commands
end;
```
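The all-or-nothing behavior is easy to demonstrate with Python's stdlib `sqlite3` module; the `accounts` table and the simulated crash are made up for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table accounts(name text, balance int)")
con.execute("insert into accounts values ('a', 100), ('b', 0)")
con.commit()

try:
    # both updates form one transfer; if either fails, neither should stick
    con.execute("update accounts set balance = balance - 50 where name='a'")
    raise RuntimeError("simulated crash between the two halves")
    con.execute("update accounts set balance = balance + 50 where name='b'")
    con.commit()
except RuntimeError:
    con.rollback()  # undo the partial transfer

balances = dict(con.execute("select name, balance from accounts"))
```

After the rollback `balances` is `{'a': 100, 'b': 0}` again: the half-finished withdrawal never became visible, which is exactly the money-transfer scenario from lec1.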
## Inner/Outer Joins
> left (left outer)
_the outer part is implied so it's unnecessary to write it in_
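A minimal left-join sketch using Python's stdlib `sqlite3` module (the `student`/`grade` tables are made up): every left-table row survives, and missing right-side matches are padded with NULL.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table student(id int, name text);
create table grade(id int, score int);
insert into student values (1, 'ana'), (2, 'bob');
insert into grade values (1, 90);
""")
# a left join keeps every row from the left table;
# rows with no match on the right get NULL (None) padding
rows = con.execute("""
    select name, score from student
    left join grade on student.id = grade.id
    order by name
""").fetchall()
```

`rows` is `[('ana', 90), ('bob', None)]`; a plain inner join would have silently dropped bob.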

cst363/lec/lec9.md Normal file
# lec9
## Lab
This lecture has two corresponding lab activities in `lab/`; the instructions for the first are in `views-lab.pdf` and for the second in `contraints-lab.pdf`.
## Views
```
create view newTableName as select ... from targetTable;
```
This will create a `view` which whenever it is queried will pull data from some base table.
Really the `view` is a kind of "_macro_" which is stored in a `catalog` that normal, non-admin users can use to access a database.
The catalog is saved in a table somewhere in the database.
Think of this catalog like a container(_table_) for the other tables in the database.
### Pros & Cons
Problems:
* Computing the view multiple times can be expensive
* Maintenance
There are two strategies for dealing with the second item: eager and lazy.
1. Eager
* If the target table of some view changes then update the view immediately
2. Lazy
* Don't update the view until it is needed(_queried_)
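sqlite3's plain views take the lazy approach: the view stores the query, not the data, and recomputes it when queried. A minimal sketch with Python's stdlib `sqlite3` module (the `pay` table and `cs_pay` view are made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table pay(dept text, salary int)")
con.execute("insert into pay values ('cs', 100), ('art', 50)")

# the view stores the query, not the data; it is recomputed on each use
con.execute("create view cs_pay as select salary from pay where dept='cs'")
before = con.execute("select * from cs_pay").fetchall()

con.execute("insert into pay values ('cs', 200)")
after = con.execute("select * from cs_pay").fetchall()
```

`before` is `[(100,)]` and `after` is `[(100,), (200,)]`: the new base-table row shows up in the view without any explicit refresh.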
## Check Constraint
Checks values when they are inserted to validate their legitimacy.
```
create table blah(
    id varchar(8) check (id like '%-%')
);
```
This is how we can avoid accidentally putting downright logically incorrect data into a table. Note that a `check` alone won't stop NULL though: `NULL like '%-%'` evaluates to NULL rather than false, so the check passes; add `not null` if NULL should be rejected too.
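Here's the check constraint in action via Python's stdlib `sqlite3` module, reusing the `blah` table from above:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    create table blah(
        id varchar(8) check (id like '%-%')
    )
""")
con.execute("insert into blah values ('ab-123')")  # contains a dash: passes

try:
    con.execute("insert into blah values ('nodash')")
    rejected = False
except sqlite3.IntegrityError:  # check constraint fails -> insert rejected
    rejected = True
```

The dash-less value raises `sqlite3.IntegrityError` and never lands in the table.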
We can also require entries be unique as well.
```
create table blah (
dept_name varchar(20),
...
unique(dept_name)
);
```
_KEEP IN MIND HOWEVER_: with `unique()`, a NULL never compares equal to anything, not even another NULL, so the uniqueness check never rejects a NULL.
That means we will be able to insert multiple NULL values into the table even though they are not unique.
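Both halves of that warning can be verified with Python's stdlib `sqlite3` module, reusing the `blah`/`dept_name` sketch from above:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table blah(dept_name varchar(20), unique(dept_name))")

con.execute("insert into blah values ('cs')")
try:
    con.execute("insert into blah values ('cs')")  # duplicate -> rejected
    dup_allowed = True
except sqlite3.IntegrityError:
    dup_allowed = False

# NULL never compares equal to NULL, so multiple NULLs slip past unique()
con.execute("insert into blah values (NULL)")
con.execute("insert into blah values (NULL)")
nulls = con.execute(
    "select count(*) from blah where dept_name is null").fetchone()[0]
```

The duplicate `'cs'` is rejected with `sqlite3.IntegrityError`, yet both NULL inserts succeed and `nulls` comes back as `2`.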