db: lec15 w/ respective lab stuff
parent 8512bcf22d
commit 36baaf0dc2
BIN
cst363/lab/hashing-lab.pdf
Normal file
Binary file not shown.
BIN
cst363/lab/other-operations-lab.pdf
Normal file
Binary file not shown.
44
cst363/lec/lec15.md
Normal file
@@ -0,0 +1,44 @@
# lec15

This lecture has two corresponding lab exercises: `lab/hashing-lab.pdf` and `lab/other-operations-lab.pdf`.

## Deleted Data on Disk

Let's say we did the following operations on our disk:

```
insert data1
insert data2
delete data1
lookup data2
```

Let's say that when we inserted data2 with a hash function there was a collision with data1.
With sequential storage we would normally try to put data2 in the next slot, right after data1.
If we try to look up data2 through our hash function we would again land at data1, so we would have to search linearly from there to find data2.
Now let's suppose that data1 is deleted.
If we look up data2 again we would still land at data1's location, but this time the slot is empty: there's no collision, ergo no linear correction, and the lookup never reaches data2.
This is why, when something is deleted on disk, we don't actually erase it.
Instead we simply _mark_ or _flag_ the block as deleted.
This means the lookup still sees a collision at that location, so it can linearly correct to data2.

The other side to this is that if a later insert lands on data1's location, it is allowed to overwrite that data because it has been marked for deletion.

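The scenario above can be sketched as a small in-memory table with linear probing. This is only an illustration under assumptions: the class `LinearProbingTable`, the `EMPTY` and `TOMBSTONE` markers, and the constant hash function that forces the collision are all hypothetical names, not something from the lecture or the lab.

```python
EMPTY = object()      # slot that has never held a record
TOMBSTONE = object()  # slot whose record was flagged as deleted


class LinearProbingTable:
    """Toy hash table; a list of slots stands in for blocks on disk."""

    def __init__(self, capacity=8, hash_fn=hash):
        self.slots = [EMPTY] * capacity
        self.hash_fn = hash_fn

    def _probe(self, key):
        # Start at the hashed slot, then step linearly, wrapping around.
        start = self.hash_fn(key) % len(self.slots)
        for step in range(len(self.slots)):
            yield (start + step) % len(self.slots)

    def _find(self, key):
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is EMPTY:
                return None  # a never-used slot ends the search
            if slot is not TOMBSTONE and slot == key:
                return idx
            # Occupied by another key or flagged: keep probing.
        return None

    def insert(self, key):
        # Never-used and flagged slots may both be (over)written.
        for idx in self._probe(key):
            if self.slots[idx] is EMPTY or self.slots[idx] is TOMBSTONE:
                self.slots[idx] = key
                return idx
        raise RuntimeError("table is full")

    def delete(self, key):
        idx = self._find(key)
        if idx is not None:
            self.slots[idx] = TOMBSTONE  # flag the slot, do not empty it

    def lookup(self, key):
        return self._find(key) is not None


# A constant hash function (an assumption for the demo) forces the collision.
table = LinearProbingTable(capacity=4, hash_fn=lambda key: 0)
table.insert("data1")
table.insert("data2")
table.delete("data1")
found = table.lookup("data2")
reused = table.insert("data3")
```

Here `found` is true because the flagged slot keeps the probe going, and `reused` is the slot that data1 used to occupy. Had `delete` reset the slot to `EMPTY`, the lookup for data2 would have stopped at the first slot.
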
## 'where' clause

Let's say we have the following query:

```
... where condition or other_condition;
```

By default, the database tries to optimize the query, effectively replacing it with its own equivalent version that is tailored to run efficiently for that task.

We can also combine conditions with `and` in the `where` clause; the database must evaluate these as well when it builds a more efficient query.

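One way to watch an optimizer at work is to ask the database for its plan. The sketch below uses SQLite through Python's `sqlite3` module purely as a stand-in, since the notes don't name a database engine; the table `t`, its columns `a` and `b`, and the two indexes are made up for illustration. The plan text differs between engines and versions, so no particular output is claimed.

```python
import sqlite3


def explain(conn, sql):
    """Return the human-readable detail of each EXPLAIN QUERY PLAN row."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the description


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")  # hypothetical table
conn.execute("CREATE INDEX idx_a ON t (a)")
conn.execute("CREATE INDEX idx_b ON t (b)")

# Same table, two different WHERE clauses: the planner picks a strategy for each.
or_plan = explain(conn, "SELECT * FROM t WHERE a = 1 OR b = 2")
and_plan = explain(conn, "SELECT * FROM t WHERE a = 1 AND b = 2")

for line in or_plan + and_plan:
    print(line)
```

Comparing the two plans shows that the engine does not simply run the conditions as written; it chooses how to use the available indexes for each form of the clause.
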
## Pages in Memory

If we have a large table that won't fit into memory, we can partition it so that each partition fits in our memory pages.
We can do a first pass where we sort each partition individually in the memory pages.
This first pass temporarily writes the sorted partitions to disk, where we can then gladiatorially sort the partitions against each other (repeatedly taking the smallest remaining item across them), writing the result to some output.
The temporary files from the first pass can then be marked for deletion.
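
The two passes can be sketched as follows. This is a minimal illustration under assumptions, not the lab's solution: `external_sort`, `_write_run`, and `page_size` are hypothetical names, records are plain integers stored one per line, and the result is returned as a list where a real system would stream it to an output file.

```python
import heapq
import os
import tempfile


def _write_run(sorted_page):
    """Write one sorted partition to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".run")
    with os.fdopen(fd, "w") as f:
        f.writelines(f"{value}\n" for value in sorted_page)
    return path


def external_sort(values, page_size):
    """Sort integers while holding at most page_size of them per partition."""
    run_paths = []
    page = []
    # Pass 1: fill a page, sort it in memory, write it out as a sorted run.
    for value in values:
        page.append(value)
        if len(page) == page_size:
            run_paths.append(_write_run(sorted(page)))
            page = []
    if page:
        run_paths.append(_write_run(sorted(page)))

    # Pass 2: merge the runs, always taking the smallest head element.
    files = [open(path) for path in run_paths]
    try:
        runs = [(int(line) for line in f) for f in files]
        result = list(heapq.merge(*runs))
    finally:
        for f in files:
            f.close()
        for path in run_paths:
            os.remove(path)  # the temporary runs are no longer needed
    return result


sorted_values = external_sort([5, 3, 9, 1, 7, 2, 8], page_size=3)
```

With `page_size=3` the seven values are split into three runs, and `heapq.merge` reads them lazily, so only the current head of each run has to be in memory during the merge.
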