Page MenuHomeSoftware Heritage

Performance analysis of read queries
Closed, MigratedEdits Locked

Description

Analysis of the read performance for swh_content_find_directory, swh_revision_find_occurrence, with the help of Dimitri.

Event Timeline

olasd claimed this task.
olasd raised the priority of this task from to High.
olasd updated the task description. (Show Details)
olasd added a subscriber: olasd.

swh_content_find_directory query :

explain (analyze, buffers, verbose) with recursive path as (
-- Recursively build a path from the requested content to a root
-- directory. Each iteration returns a pair (dir_id, filename) where
-- filename is relative to dir_id. Stops when no parent directory can
-- be found.                                                         
(select dir.id as dir_id, dir_entry_f.name as name, 0 as depth
 from directory_entry_file as dir_entry_f                     
 join content on content.sha1_git = dir_entry_f.target
 join directory as dir on dir.file_entries @> array[dir_entry_f.id]
 where content.sha1 = '\xe516d03ee598741abfd2ab156562665348629735'
 limit 1)
union all
(select dir.id as dir_id,
(dir_entry_d.name || '/' || path.name)::unix_path as name,
path.depth + 1
 from path
 join directory_entry_dir as dir_entry_d on dir_entry_d.target = path.dir_id
 join directory as dir on dir.dir_entries @> array[dir_entry_d.id]
 limit 1)
    )
    select dir_id, name from path order by depth desc limit 1;

enable_seqscan = on : http://explain.depesz.com/s/9Mv

                                                                                        QUERY PLAN                                                                                        
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=91.31..91.31 rows=1 width=68) (actual time=931302.242..931302.245 rows=1 loops=1)
   Output: path.dir_id, path.name, path.depth
   Buffers: shared hit=5087403 read=15089353, temp read=2399387 written=2399389
   CTE path
     ->  Recursive Union  (cost=5.69..91.03 rows=11 width=64) (actual time=275544.524..931302.143 rows=3 loops=1)
           Buffers: shared hit=5087400 read=15089353, temp read=2399387 written=2399389
           ->  Subquery Scan on "*SELECT* 1"  (cost=5.69..16.45 rows=1 width=38) (actual time=275544.516..275544.518 rows=1 loops=1)
                 Output: "*SELECT* 1".dir_id, "*SELECT* 1".name, 0
                 Buffers: shared hit=3302077 read=6916568
                 ->  Limit  (cost=5.69..16.44 rows=1 width=38) (actual time=275544.513..275544.514 rows=1 loops=1)
                       Output: dir.id, dir_entry_f.name, (0)
                       Buffers: shared hit=3302077 read=6916568
                       ->  Nested Loop  (cost=5.69..12295582.17 rows=1142992 width=38) (actual time=275544.510..275544.510 rows=1 loops=1)
                             Output: dir.id, dir_entry_f.name, 0
                             Join Filter: (dir.file_entries @> ARRAY[dir_entry_f.id])
                             Rows Removed by Join Filter: 205523161
                             Buffers: shared hit=3302077 read=6916568
                             ->  Nested Loop  (cost=5.69..230.17 rows=1 width=25) (actual time=7.270..7.270 rows=1 loops=1)
                                   Output: dir_entry_f.name, dir_entry_f.id
                                   Buffers: shared hit=8 read=4
                                   ->  Index Scan using content_pkey on public.content  (cost=0.57..8.59 rows=1 width=21) (actual time=0.043..0.043 rows=1 loops=1)
                                         Output: content.sha1, content.sha1_git, content.sha256, content.length, content.ctime, content.status
                                         Index Cond: ((content.sha1)::bytea = '\xe516d03ee598741abfd2ab156562665348629735'::bytea)
                                         Buffers: shared hit=6
                                   ->  Bitmap Heap Scan on public.directory_entry_file dir_entry_f  (cost=5.11..221.04 rows=54 width=46) (actual time=7.216..7.216 rows=1 loops=1)
                                         Output: dir_entry_f.name, dir_entry_f.target, dir_entry_f.id
                                         Recheck Cond: ((dir_entry_f.target)::bytea = (content.sha1_git)::bytea)
                                         Heap Blocks: exact=1
                                         Buffers: shared hit=2 read=4
                                         ->  Bitmap Index Scan on directory_entry_file_target_name_perms_idx  (cost=0.00..5.10 rows=54 width=0) (actual time=7.153..7.153 rows=1 loops=1)
                                               Index Cond: ((dir_entry_f.target)::bytea = (content.sha1_git)::bytea)
                                               Buffers: shared hit=2 read=3
                             ->  Seq Scan on public.directory dir  (cost=0.00..9437872.00 rows=228598400 width=163) (actual time=0.031..79390.386 rows=205523162 loops=1)
                                   Output: dir.id, dir.dir_entries, dir.file_entries, dir.rev_entries
                                   Buffers: shared read=6435573
           ->  Limit  (cost=0.33..7.44 rows=1 width=64) (actual time=218585.812..218585.812 rows=1 loops=3)
                 Output: dir_1.id, (((((dir_entry_d.name)::bytea || '\x2f'::bytea) || (path_1.name)::bytea))::unix_path), ((path_1.depth + 1))
                 Buffers: shared hit=1785323 read=8172785, temp read=2399387 written=2399389
                 ->  Nested Loop  (cost=0.33..170706177.85 rows=24002832 width=64) (actual time=218585.805..218585.805 rows=1 loops=3)
                       Output: dir_1.id, ((((dir_entry_d.name)::bytea || '\x2f'::bytea) || (path_1.name)::bytea))::unix_path, (path_1.depth + 1)
                       Join Filter: (dir_1.dir_entries @> ARRAY[dir_entry_d.id])
                       Rows Removed by Join Filter: 137019104
                       Buffers: shared hit=1785323 read=8172785, temp read=2399387 written=2399389
                       ->  Hash Join  (cost=0.33..4123320.61 rows=21 width=51) (actual time=37369.078..37369.078 rows=1 loops=3)
                             Output: path_1.name, path_1.depth, dir_entry_d.name, dir_entry_d.id
                             Hash Cond: ((dir_entry_d.target)::bytea = (path_1.dir_id)::bytea)
                             Buffers: shared hit=46 read=1613775
                             ->  Seq Scan on public.directory_entry_dir dir_entry_d  (cost=0.00..3438898.69 rows=182512369 width=36) (actual time=0.025..23930.048 rows=60452382 loops=3)
                                   Output: dir_entry_d.name, dir_entry_d.target, dir_entry_d.id
                                   Buffers: shared hit=46 read=1613775
                             ->  Hash  (cost=0.20..0.20 rows=10 width=68) (actual time=0.007..0.007 rows=1 loops=3)
                                   Output: path_1.name, path_1.depth, path_1.dir_id
                                   Buckets: 1024  Batches: 1  Memory Usage: 1kB
                                   ->  WorkTable Scan on path path_1  (cost=0.00..0.20 rows=10 width=68) (actual time=0.003..0.003 rows=1 loops=3)
                                         Output: path_1.name, path_1.depth, path_1.dir_id
                       ->  Materialize  (cost=0.00..14599196.00 rows=228598400 width=114) (actual time=0.033..145769.217 rows=205528657 loops=2)
                             Output: dir_1.id, dir_1.dir_entries
                             Buffers: shared hit=1065 read=6434606, temp read=2399387 written=2399389
                             ->  Seq Scan on public.directory dir_1  (cost=0.00..9437872.00 rows=228598400 width=114) (actual time=0.021..102911.372 rows=205528806 loops=1)
                                   Output: dir_1.id, dir_1.dir_entries
                                   Buffers: shared hit=1065 read=6434606
   ->  Sort  (cost=0.28..0.30 rows=11 width=68) (actual time=931302.239..931302.239 rows=1 loops=1)
         Output: path.dir_id, path.name, path.depth
         Sort Key: path.depth
         Sort Method: top-N heapsort  Memory: 25kB
         Buffers: shared hit=5087403 read=15089353, temp read=2399387 written=2399389
         ->  CTE Scan on path  (cost=0.00..0.22 rows=11 width=68) (actual time=275544.531..931302.162 rows=3 loops=1)
               Output: path.dir_id, path.name, path.depth
               Buffers: shared hit=5087400 read=15089353, temp read=2399387 written=2399389
 Planning time: 1.489 ms
 Execution time: 937179.160 ms
(71 rows)

enable_seqscan = off : http://explain.depesz.com/s/RhW

                                                                                        QUERY PLAN                                                                                        
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=97493.07..97493.07 rows=1 width=68) (actual time=20.643..20.643 rows=1 loops=1)
   Output: path.dir_id, path.name, path.depth
   Buffers: shared hit=23 read=21
   CTE path
     ->  Recursive Union  (cost=8863.94..97492.79 rows=11 width=64) (actual time=6.564..20.597 rows=3 loops=1)
           Buffers: shared hit=23 read=21
           ->  Subquery Scan on "*SELECT* 1"  (cost=8863.94..8863.97 rows=1 width=38) (actual time=6.559..6.561 rows=1 loops=1)
                 Output: "*SELECT* 1".dir_id, "*SELECT* 1".name, 0
                 Buffers: shared hit=14 read=4
                 ->  Limit  (cost=8863.94..8863.96 rows=1 width=38) (actual time=6.557..6.558 rows=1 loops=1)
                       Output: dir.id, dir_entry_f.name, (0)
                       Buffers: shared hit=14 read=4
                       ->  Nested Loop  (cost=8863.94..34809.74 rows=1142992 width=38) (actual time=6.556..6.556 rows=1 loops=1)
                             Output: dir.id, dir_entry_f.name, 0
                             Buffers: shared hit=14 read=4
                             ->  Nested Loop  (cost=5.69..230.17 rows=1 width=25) (actual time=0.105..0.105 rows=1 loops=1)
                                   Output: dir_entry_f.name, dir_entry_f.id
                                   Buffers: shared hit=12
                                   ->  Index Scan using content_pkey on public.content  (cost=0.57..8.59 rows=1 width=21) (actual time=0.049..0.049 rows=1 loops=1)
                                         Output: content.sha1, content.sha1_git, content.sha256, content.length, content.ctime, content.status
                                         Index Cond: ((content.sha1)::bytea = '\xe516d03ee598741abfd2ab156562665348629735'::bytea)
                                         Buffers: shared hit=6
                                   ->  Bitmap Heap Scan on public.directory_entry_file dir_entry_f  (cost=5.11..221.04 rows=54 width=46) (actual time=0.044..0.044 rows=1 loops=1)
                                         Output: dir_entry_f.name, dir_entry_f.target, dir_entry_f.id
                                         Recheck Cond: ((dir_entry_f.target)::bytea = (content.sha1_git)::bytea)
                                         Heap Blocks: exact=1
                                         Buffers: shared hit=6
                                         ->  Bitmap Index Scan on directory_entry_file_target_name_perms_idx  (cost=0.00..5.10 rows=54 width=0) (actual time=0.030..0.030 rows=1 loops=1)
                                               Index Cond: ((dir_entry_f.target)::bytea = (content.sha1_git)::bytea)
                                               Buffers: shared hit=5
                             ->  Bitmap Heap Scan on public.directory dir  (cost=8858.25..23149.65 rows=1142992 width=163) (actual time=6.445..6.445 rows=1 loops=1)
                                   Output: dir.id, dir.dir_entries, dir.file_entries, dir.rev_entries
                                   Recheck Cond: (dir.file_entries @> ARRAY[dir_entry_f.id])
                                   Heap Blocks: exact=1
                                   Buffers: shared hit=2 read=4
                                   ->  Bitmap Index Scan on directory_file_entries_idx  (cost=0.00..8572.50 rows=1142992 width=0) (actual time=6.416..6.416 rows=11 loops=1)
                                         Index Cond: (dir.file_entries @> ARRAY[dir_entry_f.id])
                                         Buffers: shared hit=1 read=4
           ->  Limit  (cost=8862.82..8862.86 rows=1 width=64) (actual time=4.670..4.671 rows=1 loops=3)
                 Output: dir_1.id, (((((dir_entry_d.name)::bytea || '\x2f'::bytea) || (path_1.name)::bytea))::unix_path), ((path_1.depth + 1))
                 Buffers: shared hit=9 read=17
                 ->  Nested Loop  (cost=8862.82..906318.40 rows=24002832 width=64) (actual time=4.668..4.668 rows=1 loops=3)
                       Output: dir_1.id, ((((dir_entry_d.name)::bytea || '\x2f'::bytea) || (path_1.name)::bytea))::unix_path, (path_1.depth + 1)
                       Buffers: shared hit=9 read=17
                       ->  Nested Loop  (cost=4.59..126.44 rows=21 width=51) (actual time=4.018..4.018 rows=1 loops=3)
                             Output: path_1.name, path_1.depth, dir_entry_d.name, dir_entry_d.id
                             Buffers: shared hit=2 read=12
                             ->  WorkTable Scan on path path_1  (cost=0.00..0.20 rows=10 width=68) (actual time=0.002..0.002 rows=1 loops=3)
                                   Output: path_1.dir_id, path_1.name, path_1.depth
                             ->  Bitmap Heap Scan on public.directory_entry_dir dir_entry_d  (cost=4.59..12.60 rows=2 width=36) (actual time=4.007..4.007 rows=1 loops=3)
                                   Output: dir_entry_d.name, dir_entry_d.target, dir_entry_d.id
                                   Recheck Cond: ((dir_entry_d.target)::bytea = (path_1.dir_id)::bytea)
                                   Heap Blocks: exact=2
                                   Buffers: shared hit=2 read=12
                                   ->  Bitmap Index Scan on directory_entry_dir_target_name_perms_idx  (cost=0.00..4.58 rows=2 width=0) (actual time=3.978..3.978 rows=1 loops=3)
                                         Index Cond: ((dir_entry_d.target)::bytea = (path_1.dir_id)::bytea)
                                         Buffers: shared hit=2 read=10
                       ->  Bitmap Heap Scan on public.directory dir_1  (cost=8858.24..23149.64 rows=1142992 width=114) (actual time=0.952..0.952 rows=1 loops=2)
                             Output: dir_1.id, dir_1.dir_entries, dir_1.file_entries, dir_1.rev_entries
                             Recheck Cond: (dir_1.dir_entries @> ARRAY[dir_entry_d.id])
                             Heap Blocks: exact=2
                             Buffers: shared hit=7 read=5
                             ->  Bitmap Index Scan on directory_dir_entries_idx  (cost=0.00..8572.49 rows=1142992 width=0) (actual time=0.931..0.931 rows=2 loops=2)
                                   Index Cond: (dir_1.dir_entries @> ARRAY[dir_entry_d.id])
                                   Buffers: shared hit=5 read=5
   ->  Sort  (cost=0.28..0.30 rows=11 width=68) (actual time=20.641..20.641 rows=1 loops=1)
         Output: path.dir_id, path.name, path.depth
         Sort Key: path.depth
         Sort Method: top-N heapsort  Memory: 25kB
         Buffers: shared hit=23 read=21
         ->  CTE Scan on path  (cost=0.00..0.22 rows=11 width=68) (actual time=6.570..20.611 rows=3 loops=1)
               Output: path.dir_id, path.name, path.depth
               Buffers: shared hit=23 read=21
 Planning time: 1.463 ms
 Execution time: 20.994 ms
(75 rows)

swh_revision_find_occurrence

shortcut query:

explain (analyze,buffers,verbose)     select origin, branch, revision                                                                                                          
    from occurrence_history as occ_hist
    where occ_hist.revision = '\x1140811ebed3b33b496a3aed32ed16b068ebffab'
    order by upper(occ_hist.validity)  -- TODO filter by authority?
    limit 1;

output : http://explain.depesz.com/s/45Bn

                                                                                                      QUERY PLAN                                                                                                      
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=28.56..28.57 rows=1 width=74) (actual time=769.001..769.001 rows=0 loops=1)
   Output: origin, branch, revision, (upper(validity))
   Buffers: shared hit=67879
   ->  Sort  (cost=28.56..28.58 rows=6 width=74) (actual time=768.998..768.998 rows=0 loops=1)
         Output: origin, branch, revision, (upper(validity))
         Sort Key: (upper(occ_hist.validity))
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=67879
         ->  Index Scan using occurrence_history_origin_branch_revision_authority_validi_excl on public.occurrence_history occ_hist  (cost=0.41..28.53 rows=6 width=74) (actual time=768.977..768.977 rows=0 loops=1)
               Output: origin, branch, revision, upper(validity)
               Index Cond: ((occ_hist.revision)::bytea = '\x1140811ebed3b33b496a3aed32ed16b068ebffab'::bytea)
               Buffers: shared hit=67879
 Planning time: 0.317 ms
 Execution time: 769.112 ms`

naive query :

explain (analyze, buffers, verbose)  select origin, branch, revision
from swh_revision_list_children('\x1140811ebed3b33b496a3aed32ed16b068ebffab') as rev_list(sha1_git)                                                                                                          
left join occurrence_history occ_hist                             
on rev_list.sha1_git = occ_hist.revision
where occ_hist.origin is not null
order by upper(occ_hist.validity)  -- TODO filter by authority?
limit 1;

output: http://explain.depesz.com/s/NKy

                                                                                        QUERY PLAN                                                                                         
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=7449.19..7449.19 rows=1 width=74) (actual time=198.216..198.217 rows=1 loops=1)
   Output: occ_hist.origin, occ_hist.branch, occ_hist.revision, (upper(occ_hist.validity))
   Buffers: shared hit=54065
   ->  Sort  (cost=7449.19..7452.37 rows=1273 width=74) (actual time=198.214..198.214 rows=1 loops=1)
         Output: occ_hist.origin, occ_hist.branch, occ_hist.revision, (upper(occ_hist.validity))
         Sort Key: (upper(occ_hist.validity))
         Sort Method: top-N heapsort  Memory: 25kB
         Buffers: shared hit=54065
         ->  Nested Loop  (cost=1276.21..7442.82 rows=1273 width=74) (actual time=0.780..198.119 rows=59 loops=1)
               Output: occ_hist.origin, occ_hist.branch, occ_hist.revision, upper(occ_hist.validity)
               Buffers: shared hit=54065
               ->  CTE Scan on rev_list  (cost=1275.78..1279.60 rows=191 width=32) (actual time=0.081..134.879 rows=6713 loops=1)
                     Output: rev_list.id
                     Buffers: shared hit=33825
                     CTE rev_list
                       ->  Recursive Union  (cost=0.56..1275.78 rows=191 width=21) (actual time=0.075..128.774 rows=6713 loops=1)
                             Buffers: shared hit=33825
                             ->  Index Only Scan using revision_pkey on public.revision  (cost=0.56..8.58 rows=1 width=21) (actual time=0.065..0.066 rows=1 loops=1)
                                   Output: revision.id
                                   Index Cond: (revision.id = ('\x1140811ebed3b33b496a3aed32ed16b068ebffab'::bytea)::bytea)
                                   Heap Fetches: 1
                                   Buffers: shared hit=5
                             ->  Nested Loop  (cost=4.58..126.34 rows=19 width=21) (actual time=0.017..0.038 rows=2 loops=3073)
                                   Output: h.id
                                   Buffers: shared hit=33820
                                   ->  WorkTable Scan on rev_list rev_list_1  (cost=0.00..0.20 rows=10 width=32) (actual time=0.000..0.001 rows=2 loops=3073)
                                         Output: rev_list_1.id
                                   ->  Bitmap Heap Scan on public.revision_history h  (cost=4.58..12.59 rows=2 width=42) (actual time=0.015..0.015 rows=1 loops=6713)
                                         Output: h.id, h.parent_id, h.parent_rank
                                         Recheck Cond: ((h.parent_id)::bytea = (rev_list_1.id)::bytea)
                                         Heap Blocks: exact=6889
                                         Buffers: shared hit=33820
                                         ->  Bitmap Index Scan on revision_history_parent_id_idx  (cost=0.00..4.58 rows=2 width=0) (actual time=0.012..0.012 rows=1 loops=6713)
                                               Index Cond: ((h.parent_id)::bytea = (rev_list_1.id)::bytea)
                                               Buffers: shared hit=26931
               ->  Index Scan using occurrence_history_revision_idx on public.occurrence_history occ_hist  (cost=0.43..32.17 rows=7 width=74) (actual time=0.009..0.009 rows=0 loops=6713)
                     Output: occ_hist.origin, occ_hist.branch, occ_hist.revision, occ_hist.authority, occ_hist.validity
                     Index Cond: ((occ_hist.revision)::bytea = (rev_list.id)::bytea)
                     Filter: (occ_hist.origin IS NOT NULL)
                     Buffers: shared hit=20240
 Planning time: 1.061 ms
 Execution time: 198.672 ms
(42 rows)

I'm currently toying around with another approach for those recursive queries:

  • Creation of an object_id sequence
  • Addition of an object_id to all kinds of objects (content, skipped_content, directory, revision, release, occurrence_history), that dips from this object_id sequence
  • Creation of a hierarchy_cache table with two columns: parent, child.

The queries to fill in the hierarchy_cache table are pretty simple, and are append-only. For instance for the directory hierarchy:

insert into hierarchy_cache (
    with unnested as (
        select directory.object_id, unnest(directory.dir_entries) as dir
         from directory
    )
    select unnested.object_id as parent, directory.object_id as child
    from unnested
    left join directory_entry_dir e
        on unnested.dir = e.id
    left join directory
        on directory.id = e.target
);

The queries that walk through the history cache recursively are trivial

The object_id column looks like something that we want.

The hierarchy_cache relation takes a (very) long time to populate.

We're now investigating a direct revision -> content shortcut relationship.

zack lowered the priority of this task from High to Normal.Apr 27 2016, 9:02 PM
olasd changed the visibility from "All Users" to "Public (No Login Required)".May 13 2016, 5:06 PM
zack added a subscriber: zack.

looks like this is no longer of interest