Re^11: Architecture design for full stack development.

Update: Ah, didn't see your update in time. Good, we seem to (more or less) agree then :) /Update

No, you are still mistaken (or, just possibly, do not put forward your point clearly enough).

Let's read what's on the page that you mentioned:

https://www.postgresql.org/docs/current/static/sql-altertable.html

It says:

CLUSTER ON

    This form selects the default index for future CLUSTER operations.
+ It does not actually re-cluster the table.
[download]

So it says: 'future CLUSTER operations'. That means the next CLUSTER command. That is the CLUSTER ON command. On the html-page, that CLUSTER is actually a link into sql-cluster.html. This is the CLUSTER I was talking about all along (naturally, as there is no other CLUSTERing in postgres).

So no, Postgres does not have clustered indexes, except in the sense of the CLUSTER ON command (which re-organises the underlying table on the basis of a /single/ index; and which is non-persistent (as you rightly say))

Comment on Re^11: Architecture design for full stack development. Download Code

Replies are listed 'Best First'.
Re^12: Architecture design for full stack development. by anonymized user 468275 (Curate) on Jun 25, 2017 at 06:55 UTC
Yes and I am grateful to you for putting me right - I could have gone on thinking that the clustered index could be relied upon to persist rather than requiring scheduled cluster operations (e.g. a cron or autosys job) instead. One world, one people	[reply]
Re^13: Architecture design for full stack development. by erix (Prior) on Jun 25, 2017 at 08:43 UTC
... cron or autosys job In view of the 'billions of records' you mentioned earlier, in combination with a busy site: Keep in mind that a table is completely blocked for both read and write during a CLUSTER operation. This often simply means that it is not feasible to use CLUSTER (unless you can suffer the 'downtime'). I'd say, just forget about primary key- (or index-) 'fragmentation' as a problem. It is not, really. It should not be an application level concern. I did a comparison between gapped-values versus contiguous-values tables (starting out with 1 billion rows (=34GB), deleting 90%, then comparing to same-size contiguous-numbered table), and it turns out that indeed, there is no difference in performance. `$ ./testfrag.sh -- rowcount1 [1_000_000_000] -- rowcount2 [100000000] -- gap_size [10] -- running tests... fragmented: average (100 searches): 23.754 ms contiguous: average (100 searches): 26.553 ms` [download] (The difference seen here is coincidence -- after another CLUSTER it might be the other way round) The test was done on my old desktop, and above are cold-cache timings (cached, both go down to ~2.0 ms (again no difference)).	[reply] [d/l]


Think about Loose Coupling
	PerlMonks