csnotes/cst363/lec/lec15.md
2018-10-22 17:36:23 -07:00


# lec15
This lecture has two corresponding lab exercises: `lab/hashing-lab.pdf` and `lab/other-operations.pdf`.
## Deleted Data on Disk
Let's say we did the following operations on our disk:
```
insert data1
insert data2
delete data1
lookup data2
```
Let's say that when we inserted data2, the hash function produced a collision with data1.
In sequential storage we would normally try to put data2 right after data1.
If we look up data2 through our hash function we again land at data1, so we have to search linearly for data2.
Now let's suppose that data1 is deleted.
If we look up data2 again we still land at data1's location, but this time the location is empty: there's no collision, ergo, nothing tells us to search linearly onward to reach data2.
This is why when something is deleted on disk we don't actually erase it.
Instead we simply _mark_ or _flag_ the block for deletion.
This means a lookup still gets a collision at that location, so it keeps searching linearly and reaches data2.
The other side to this is that if we do another insert that collides with data1's location, we are allowed to overwrite that data because it has been marked for deletion.
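
The marking scheme above can be sketched in a few lines of Python. This is only an illustration under assumptions: the class name `ProbingTable`, the `EMPTY` and `DELETED` markers, and the in-memory list standing in for disk blocks are all hypothetical, and the hash function is rigged so that every key collides.

```python
EMPTY = object()    # slot that has never held data
DELETED = object()  # slot whose data was marked for deletion


class ProbingTable:
    """Hypothetical in-memory stand-in for hashed blocks on disk."""

    def __init__(self, size=8, hash_fn=hash):
        self.slots = [EMPTY] * size
        self.hash_fn = hash_fn

    def _probe(self, key):
        # Start at the hashed slot, then walk forward linearly (wrapping).
        start = self.hash_fn(key) % len(self.slots)
        for step in range(len(self.slots)):
            yield (start + step) % len(self.slots)

    def insert(self, key):
        first_free = None
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is EMPTY:
                if first_free is None:
                    first_free = i
                break  # key cannot be stored past a never-used slot
            if slot is DELETED:
                if first_free is None:
                    first_free = i  # a marked slot may be overwritten
            elif slot == key:
                return i  # already stored
        if first_free is None:
            raise RuntimeError("table is full")
        self.slots[first_free] = key
        return first_free

    def lookup(self, key):
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is EMPTY:
                return None  # a never-used slot ends the search
            if slot is not DELETED and slot == key:
                return i
            # occupied by another key or marked deleted: keep searching
        return None

    def delete(self, key):
        i = self.lookup(key)
        if i is not None:
            self.slots[i] = DELETED  # mark, do not erase
        return i


# Force every key to hash to slot 0 so data1 and data2 collide.
table = ProbingTable(size=4, hash_fn=lambda key: 0)
slot_data1 = table.insert("data1")
slot_data2 = table.insert("data2")
table.delete("data1")
found_data2 = table.lookup("data2")  # still reachable past the marked slot
slot_data3 = table.insert("data3")   # overwrites the marked slot
```

If `delete` reset the slot to `EMPTY` instead of `DELETED`, the lookup for data2 would stop at slot 0 and report that data2 is missing.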
## 'where' clause
Let's say we have the following query:
```
... where condition or other_condition;
```
By default the database will try to optimize the query: it effectively replaces the query we wrote with its own version of the same query, tailored specifically for that task.
We can also use `and`s in the `where` clause, which the database must also evaluate when it creates the more efficient query.
## Pages in Memory
If we have a large table that won't fit into memory, we can partition the table so that each partition fits in our memory pages.
We can do a first pass where we sort the individual partitions in the memory pages.
This first pass temporarily writes our sorted partitions to the disk, where we can then gladiatorially sort the partitions against each other (that is, merge them), writing the result to some output.
The temporary files from the first pass can then be marked for deletion.
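
The two passes can be sketched as follows. This is a minimal sketch under assumptions, not how a real database does it: the function name `external_sort` and the `page_size` parameter are hypothetical, the table is just a stream of integers, and the temporary files are removed outright rather than marked for deletion.

```python
import heapq
import os
import tempfile


def external_sort(values, page_size):
    """Yield the integers in `values` in sorted order, holding at most
    `page_size` of them in memory during the first pass."""
    run_paths = []

    def write_run(page):
        # First pass: sort one partition in memory, write it to a temp file.
        page.sort()
        fd, path = tempfile.mkstemp(suffix=".run", text=True)
        with os.fdopen(fd, "w") as f:
            f.writelines(f"{v}\n" for v in page)
        run_paths.append(path)

    page = []
    for v in values:
        page.append(v)
        if len(page) == page_size:
            write_run(page)
            page = []
    if page:
        write_run(page)

    # Second pass: merge the sorted partitions against each other,
    # always emitting the smallest item at the front of any partition.
    files = [open(p) for p in run_paths]
    try:
        runs = [(int(line) for line in f) for f in files]
        yield from heapq.merge(*runs)
    finally:
        for f in files:
            f.close()
        for p in run_paths:
            os.remove(p)  # temporary files are no longer needed


result = list(external_sort([5, 3, 9, 1, 7, 2, 8], page_size=3))
```

Here `heapq.merge` reads lazily from each file, so the second pass only needs the current front item of each partition in memory at once.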