When adding a new column and assigning values to it, you may be tempted to write something like: (I'm using postgres)
-- Add cost column ALTER TABLE products add COLUMN cost DECIMAL; update products set cost = 0.00; ALTER TABLE products ALTER COLUMN cost SET NOT NULL;
These three queries are not the fastest way of doing this. It requires several passes through the table on disk, the first time it adds the column, assigning null to each row. The next time it goes through them all changing the value to be "0.00". It then sets the table to be not-null.
It can be done a lot faster by assigning the default value at the same time as adding the column.
-- Add cost column ALTER TABLE products add COLUMN cost DECIMAL NOT NULL DEFAULT 0.00; ALTER TABLE products ALTER COLUMN cost DROP DEFAULT;
the second line here can even be omitted if you don't mind the default still being there, not that it takes a long time.
Similarly I saw recently one where all the existing data had to have a different value from the new default. This was implemented as:
ALTER TABLE products ADD COLUMN new_flag BOOLEAN DEFAULT false; UPDATE products SET new_flag=true;
It was made significantly faster by changing it to
ALTER TABLE products ADD COLUMN new_flag BOOLEAN DEFAULT true; ALTER TABLE products ALTER COLUMN new_flag SET DEFAULT false;
which does the same thing, but only scanning through the table once.
Why doesn't the update command in SQL support an "order by" clause?
Logically you'd think "That'll be because the order doesn't matter" - SQL is all mathematical set operations. Unfortunately it does.
Think of the case (that you shouldn't do), of altering the primary key of a table:
update my_table set id = id + 1;
Think of this as being like Hilbert's Hotel - if there were no collisions before running it (and nothing added to it whilst the update runs) then there should be no collisions afterwards.
Unfortunately you do get an error indicating a violation of the primary key uniqueness constraint. Obviously it sets the id of the row which used to have id=1 to be id=2 then checks for the constraint at this point there is already a row with id=2 (that we're intending to make id=3). If we updated them in order from the highest id first, then there wouldn't be any problem as we would free up a "room" before trying to put anyone else in it.