You can get around this by running updates on parts of the table at a time (update where pkid between 0000...) and running vacuum between each run to reclaim the space.
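A minimal sketch of that batching approach, written as a Python helper that only builds the SQL text. The table name `stats`, the batch size, and the helper name `batched_update_statements` are hypothetical; `pkid` and `test = 'abc'` come from the example in this post. The sketch assumes a dense integer key starting at 1 and trusted identifiers (nothing here is safe for user-supplied names).

```python
from typing import Iterator, Tuple


def batched_update_statements(
    table: str, set_clause: str, key: str, max_id: int, batch_size: int
) -> Iterator[Tuple[str, str]]:
    """Yield (update_sql, vacuum_sql) pairs covering key values 1..max_id."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    low = 1
    while low <= max_id:
        high = min(low + batch_size - 1, max_id)
        # Update only one slice of the table, so only that slice gets new row versions.
        update_sql = (
            f"UPDATE {table} SET {set_clause} "
            f"WHERE {key} BETWEEN {low} AND {high};"
        )
        # A plain VACUUM afterwards makes the dead versions reusable by the next batch.
        vacuum_sql = f"VACUUM {table};"
        yield update_sql, vacuum_sql
        low = high + 1


if __name__ == "__main__":
    for update_sql, vacuum_sql in batched_update_statements(
        "stats", "test = 'abc'", "pkid", max_id=25_000, batch_size=10_000
    ):
        print(update_sql)
        print(vacuum_sql)
```

Each statement would have to be sent outside an explicit transaction block, since VACUUM cannot run inside one.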
POSTGRESQL VS MYSQL UPDATE
I killed the vacuum (select pg_cancel_backend(12345), with the real pid in place of 12345) and the statement finished immediately. A vacuum on this table takes a long time to run, by the way. Normally that's not a big deal, but when making changes to table structure you have to wait on vacuums, or kill them. Dropping a column is just as simple and fast.

Now we come to the problem with PostgreSQL, and that is the in-heap MVCC storage. If you add that column and then run update table set test='abc', it updates each row and exactly doubles the size of the table, unless HOT can update the rows in place, but then you need a table with a 50% fill factor, which is double-sized to begin with. The only way to get the space back is either to wait and let vacuum reclaim it over time and reuse it one update at a time, or to run cluster or vacuum full to shrink it back down.
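The doubling can be illustrated with a toy model. This is only a sketch and not how PostgreSQL stores data: it ignores pages, HOT, the free space map, and the fact that a real vacuum can truncate empty pages at the end of a table. The names `ToyHeap`, `full_update_size`, and `batched_update_size` are made up for the illustration. An update writes a new row version and leaves the old one dead; vacuum turns dead versions into reusable free slots but never shrinks the heap.

```python
from dataclasses import dataclass, field
from typing import List, Optional

DEAD = "dead"  # marker for an old row version not yet vacuumed


@dataclass
class ToyHeap:
    """Each slot holds a live row id, DEAD, or None (free and reusable)."""
    slots: List[Optional[object]] = field(default_factory=list)

    def insert(self, row_id: int) -> None:
        # Reuse a free slot if vacuum has produced one, otherwise grow the heap.
        for i, slot in enumerate(self.slots):
            if slot is None:
                self.slots[i] = row_id
                return
        self.slots.append(row_id)

    def update(self, row_id: int) -> None:
        # MVCC-style update: the old version becomes dead, a new version is written.
        self.slots[self.slots.index(row_id)] = DEAD
        self.insert(row_id)

    def vacuum(self) -> None:
        # Dead versions become free space; the heap itself does not shrink.
        self.slots = [None if s == DEAD else s for s in self.slots]


def full_update_size(n_rows: int) -> int:
    heap = ToyHeap()
    for i in range(n_rows):
        heap.insert(i)
    for i in range(n_rows):
        heap.update(i)
    heap.vacuum()
    return len(heap.slots)


def batched_update_size(n_rows: int, batch: int) -> int:
    heap = ToyHeap()
    for i in range(n_rows):
        heap.insert(i)
    for start in range(0, n_rows, batch):
        for i in range(start, min(start + batch, n_rows)):
            heap.update(i)
        heap.vacuum()  # free the dead versions before the next batch
    return len(heap.slots)


if __name__ == "__main__":
    print(full_update_size(1000))          # 2000: the heap doubled
    print(batched_update_size(1000, 100))  # 1100: grew by one batch only
```

In this model the single big update leaves the heap at twice its original size even after vacuum, while updating in batches with a vacuum between them lets each batch reuse the space freed by the previous one.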
When the only tool you know is a hammer, all your problems look like a nail. For this problem, PostgreSQL is much, much better at handling these types of changes. And the fact is, no matter how well you designed your app, you WILL have to change the schema on a live database someday. While MySQL's various engines really are amazing for certain corner cases, here none of them help.

Almost any developer will say that MySQL is better suited for websites and online transactions, while PostgreSQL is better suited for large and complex analytical processes. They will also add that PostgreSQL comes with a lot of great features, such as extensibility and native NoSQL functionality, which help in managing a complex database.

PostgreSQL's very close integration between its various layers means that you can have things like transactional DDL, which lets you roll back anything that isn't an alter / create database / tablespace. That integration limits PostgreSQL to the things it does well (traditional transactional database load handling is a strong point) and leaves it not so great at the things MySQL often fills the gaps on, like live networked clustered storage with the NDB engine. In this case, none of the different engines in MySQL lets you easily solve this problem. The very versatility of multiple storage engines means that the lexer / parser / top layer of the database cannot be as tightly integrated with the storage engines, and therefore a lot of the cool things PostgreSQL can do here MySQL can't.

I've got a 118-gigabyte table in my stats db. It really should be partitioned, but it's not read a whole lot, and when it is we can wait on it. At 300MB/sec (the speed the array it's on can read) it takes approximately 118 * ~3 seconds to read, or about 6 minutes. This machine has 32 gigs of RAM, so it cannot hold the table in memory. When I ran the simple statement on this table: