csnotes/cst363/lec/lec2.md

# lec2

Covering `tables, tuples, data stuff`

## Problem Statement

We need to be able to manipulate data easily.

Earlier database systems, like IMS, had been using tree structures for some time, but there was still a demand for a system that _anyone_ could use.
This issue is what brings us closer to the table structure that we have mentioned in previous lectures.

To actually guide _how_ we use tables we'll use the following logic:

* Rows --> Contain whole entries or records
	* All the data in a row is meant to go together

* Columns --> Individually they are attributes or fields
	* Each column is guaranteed to hold _only_ 1 type of data (e.g. a column might be name, title, balance, or id\_number)

* Table --> __relation__
	* _Relation instance_ is another term used for a table

* Domain
	* The set of values allowed in a field
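The vocabulary above can be sketched with Python's built-in `sqlite3` module. This is only an illustration: the `account` table, its column names, and the non-negative `balance` rule are made up for the example, and the `CHECK` constraint is just one way of approximating a domain.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A table (relation) with three columns (attributes).
# The CHECK constraint narrows the domain of `balance` (hypothetical rule).
conn.execute(
    """
    CREATE TABLE account (
        id_number INTEGER,
        name      TEXT,
        balance   REAL CHECK (balance >= 0)
    )
    """
)

# Each inserted row (tuple) is one whole record.
conn.execute("INSERT INTO account VALUES (1, 'Ada', 25.0)")
conn.execute("INSERT INTO account VALUES (2, 'Grace', 100.5)")

rows = conn.execute(
    "SELECT id_number, name, balance FROM account ORDER BY id_number"
).fetchall()

# Column names, read back from the table definition.
columns = [info[1] for info in conn.execute("PRAGMA table_info(account)")]

# A value outside the domain is rejected.
try:
    conn.execute("INSERT INTO account VALUES (3, 'Linus', -5.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```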

## NULL

`NULL` is special, including in sqlite3, because comparisons and arithmetic involving it do not produce an ordinary value.
If we tried, for example, `NULL < 3`, we would just get back `NULL`; that way the result is never an arbitrary true or false, and we can filter out the unknown results later on.
There are a few exceptions, however, where `NULL` is accounted for.

* Count
	* `COUNT(*)` only counts whether a row is there or not; the data inside, `NULL` or otherwise, does not matter in this context.
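A small sketch of this behavior using Python's `sqlite3` module, where SQL `NULL` comes back as Python `None`. The table `t` and its `balance` column are hypothetical. The last query shows a related detail: `COUNT(column)`, unlike `COUNT(*)`, skips rows where that column is `NULL`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Comparisons and arithmetic with NULL yield NULL (None in Python).
null_compare = conn.execute("SELECT NULL < 3").fetchone()[0]
null_sum = conn.execute("SELECT NULL + 1").fetchone()[0]

# Hypothetical table with one NULL value in it.
conn.execute("CREATE TABLE t (balance REAL)")
conn.executemany("INSERT INTO t VALUES (?)", [(10.0,), (None,), (5.0,)])

# COUNT(*) counts rows, whatever they contain.
count_rows = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]

# COUNT(balance) counts only the non-NULL values in that column.
count_values = conn.execute("SELECT COUNT(balance) FROM t").fetchone()[0]
```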

## Key Types

* Super Key
* Candidate Key
* Primary Key

### Problem Statement

The rows are not distinguishable from each other; we still have a mess of data sitting there unlabeled. Some kind of identifier is necessary to be able to access every tuple in the relational set.

### Super Key

A set of attributes is a __superkey__ for a table as long as that combination of fields is unique for every tuple in the relational set.
In other words, if we have multiple fields, then {f1, f3, f5} might be a good combo to use as a key into the table, because no two entries in the table share the same values for all three.

* What's a valid superkey?

For starters, anything that contains another valid superkey.
More generally, any subset of the table's attributes that can uniquely identify any row in the table.

* Can a whole row be a superkey?

As long as the full set of attributes can identify any unique row in a table, then it _is_ a superkey for that table.
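The uniqueness test can be sketched in plain Python. The helper `is_superkey` and the sample rows are hypothetical, and the check only looks at the rows it is given; a real superkey has to stay unique for every row the table could ever hold, which data alone cannot prove.

```python
def is_superkey(rows, attrs):
    """Return True if no two rows share the same values for `attrs`."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False  # two rows collide on these attributes
        seen.add(key)
    return True


# Hypothetical sample data.
rows = [
    {"first": "Ada", "last": "Lovelace", "dept": "Math"},
    {"first": "Ada", "last": "Byron", "dept": "Math"},
    {"first": "Grace", "last": "Hopper", "dept": "Navy"},
]

first_only = is_superkey(rows, ["first"])           # "Ada" repeats
last_only = is_superkey(rows, ["last"])             # all distinct
last_and_dept = is_superkey(rows, ["last", "dept"])  # contains a superkey
whole_row = is_superkey(rows, ["first", "last", "dept"])
```

Here `["last", "dept"]` passes simply because it contains `["last"]`, which already passes on its own.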

### Candidate Key

Any superkey that would no longer be a superkey if one of its attributes were removed. Say that we have a superkey that takes columns {1,3,5,6,7}; it is a candidate key if removing any one of those columns means the remaining set no longer reliably identifies an arbitrary _unique_ row.
To put it simply, it is a minimal superkey, though this doesn't mean that there can't be multiple candidate keys for a given table.
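One way to sketch minimality is a brute-force search: try attribute sets from smallest to largest and keep each superkey that does not contain a smaller key already found. The function names and sample rows are hypothetical, and as before the check only reflects the data given, not every possible state of the table.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """Return True if no two rows share the same values for `attrs`."""
    keys = [tuple(row[a] for a in attrs) for row in rows]
    return len(keys) == len(set(keys))


def candidate_keys(rows, attributes):
    """Return every minimal superkey, searching smallest sets first."""
    found = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            # Skip anything that contains a smaller key: it is not minimal.
            if any(set(key) <= set(combo) for key in found):
                continue
            if is_superkey(rows, combo):
                found.append(combo)
    return found


# Hypothetical sample data: both id and email are unique, name is not.
rows = [
    {"id": 1, "email": "a@example.edu", "name": "Ada"},
    {"id": 2, "email": "b@example.edu", "name": "Ada"},
    {"id": 3, "email": "c@example.edu", "name": "Grace"},
]

keys = candidate_keys(rows, ["id", "email", "name"])
```

This table ends up with two candidate keys, `("id",)` and `("email",)`, which matches the point that a table may have more than one.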

### Primary Key

Any candidate key the database designer has chosen to serve as the unique identifier for the rows of the table.
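A minimal sketch with Python's `sqlite3` module, assuming a hypothetical `student` table where both `id_number` and `email` are candidate keys and the designer picks `id_number` as the primary key. The other candidate key is declared `UNIQUE`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# id_number is the chosen primary key; email is another candidate key.
conn.execute(
    """
    CREATE TABLE student (
        id_number INTEGER PRIMARY KEY,
        email     TEXT UNIQUE,
        name      TEXT
    )
    """
)
conn.execute("INSERT INTO student VALUES (1, 'ada@example.edu', 'Ada')")

# A second row with the same primary key value is rejected.
try:
    conn.execute("INSERT INTO student VALUES (1, 'grace@example.edu', 'Grace')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```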

### Foreign Key

If a table/relation includes among its attributes the primary key of another relation, then that attribute is referred to as a foreign key, because it references another relation.
The table being referred to is called the referenced relation.
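A sketch of a foreign key in sqlite3, using hypothetical `department` and `student` tables. Note that SQLite only enforces foreign keys on a connection after `PRAGMA foreign_keys = ON` has been run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default

# department is the referenced relation.
conn.execute(
    "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT)"
)
# student.dept_id is a foreign key pointing at department's primary key.
conn.execute(
    """
    CREATE TABLE student (
        id_number INTEGER PRIMARY KEY,
        name      TEXT,
        dept_id   INTEGER REFERENCES department(dept_id)
    )
    """
)

conn.execute("INSERT INTO department VALUES (10, 'Computer Science')")
conn.execute("INSERT INTO student VALUES (1, 'Ada', 10)")

# A student pointing at a department that does not exist is rejected.
try:
    conn.execute("INSERT INTO student VALUES (2, 'Grace', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# The foreign key lets us follow a student to its department.
joined = conn.execute(
    """
    SELECT student.name, department.name
    FROM student JOIN department USING (dept_id)
    """
).fetchall()
```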