
Jan 21 2022

anlambert added a comment to T3870: Analyze svn externals repository ingestion with loader.svn v1.0.

Sentry issue: SWH-LOADER-SVN-58

Jan 21 2022, 3:40 PM · SVN Loader
anlambert requested review of D7013: utils: Handle new edge cases in parse_external_definition.
Jan 21 2022, 3:40 PM
anlambert added a revision to T3870: Analyze svn externals repository ingestion with loader.svn v1.0: D7013: utils: Handle new edge cases in parse_external_definition.
Jan 21 2022, 3:38 PM · SVN Loader
anlambert added a comment to D7009: Consider unauthorized access to origin as a not found visit status.

I think it would be better to add a new visit status, like restricted or unauthorized. What do you think?

Jan 21 2022, 2:39 PM
anlambert updated the task description for T3870: Analyze svn externals repository ingestion with loader.svn v1.0.
Jan 21 2022, 2:21 PM · SVN Loader

Jan 20 2022

anlambert closed T3403: Use forge URL network location as default lister instance name, a subtask of T3127: Compute and display distribution of origins by forge, as Resolved.
Jan 20 2022, 6:44 PM · Metrics/monitoring, Web app, Roadmap 2021, meta-task
anlambert closed T3403: Use forge URL network location as default lister instance name as Resolved.

This task has been complete for a while now, closing it.

Jan 20 2022, 6:44 PM · Scheduling utilities, Lister
anlambert closed T3127: Compute and display distribution of origins by forge as Resolved.

Is there a reason not to close this task?

Jan 20 2022, 6:43 PM · Metrics/monitoring, Web app, Roadmap 2021, meta-task

Jan 19 2022

anlambert closed T3862: Sanitize WordPress production database as Resolved.

Time to switch production database to a new sanitized one.

Jan 19 2022, 6:46 PM · Website
anlambert added a comment to T3864: staging: Deploy swh.loader.svn v1.0.0.

@ardumont deployed swh-loader-svn v1.0.0 on staging and restarted the loader service.

Jan 19 2022, 5:30 PM · System administration, SVN Loader
anlambert closed D6977: replay: Prevent removal of external paths overlapping versioned ones.
Jan 19 2022, 4:52 PM
anlambert committed rDLDSVN11740ebc2993: replay: Prevent removal of external paths overlapping versioned ones (authored by anlambert).
replay: Prevent removal of external paths overlapping versioned ones
Jan 19 2022, 4:52 PM
anlambert requested review of D6977: replay: Prevent removal of external paths overlapping versioned ones.
Jan 19 2022, 2:40 PM
anlambert added a revision to T611: support for external definitions in the svn/subversion loader: D6977: replay: Prevent removal of external paths overlapping versioned ones.
Jan 19 2022, 2:38 PM · SVN Loader
anlambert added a comment to T3862: Sanitize WordPress production database.

This is what I have done to sanitize the production database and plug it into our testbed for testing.

Jan 19 2022, 12:58 PM · Website

Jan 18 2022

anlambert lowered the priority of T3862: Sanitize WordPress production database from High to Normal.

So it turns out the customizer issue was due to the recent upgrade of the wp-extra-file-types plugin.

Jan 18 2022, 8:36 PM · Website
anlambert triaged T3862: Sanitize WordPress production database as High priority.
Jan 18 2022, 6:55 PM · Website
anlambert closed D6962: Rename ra module to replay.
Jan 18 2022, 2:07 PM
anlambert committed rDLDSVN5df02d6aae97: Rename ra module to replay (authored by anlambert).
Rename ra module to replay
Jan 18 2022, 2:07 PM
anlambert added a comment to D6962: Rename ra module to replay.

Indeed! I have no idea what "ra" means :)

Jan 18 2022, 2:07 PM
anlambert requested review of D6962: Rename ra module to replay.
Jan 18 2022, 1:58 PM
anlambert closed D6950: ra: Send modified objects only to storage after replaying a revision.
Jan 18 2022, 12:50 PM
anlambert committed rDLDSVNf6fbbb789715: ra: Send modified objects only to storage after replaying a revision (authored by anlambert).
ra: Send modified objects only to storage after replaying a revision
Jan 18 2022, 12:50 PM
anlambert closed D6925: ra: Put externals in cache to avoid exporting them again.
Jan 18 2022, 12:50 PM
anlambert committed rDLDSVNa820d7eab8d5: ra: Put externals in cache to avoid exporting them again (authored by anlambert).
ra: Put externals in cache to avoid exporting them again
Jan 18 2022, 12:50 PM
anlambert closed D6895: ra: Add support for subversion external definitions.
Jan 18 2022, 12:50 PM
anlambert committed rDLDSVN473fe145f4b7: ra: Add support for subversion external definitions (authored by anlambert).
ra: Add support for subversion external definitions
Jan 18 2022, 12:50 PM
anlambert closed D6839: utils: Add a function to parse a subversion external definition.
Jan 18 2022, 12:50 PM
anlambert committed rDLDSVNf1913512a5fa: utils: Add a function to parse a subversion external definition (authored by anlambert).
utils: Add a function to parse a subversion external definition
Jan 18 2022, 12:50 PM
anlambert updated the diff for D6950: ra: Send modified objects only to storage after replaying a revision.

Rebase

Jan 18 2022, 12:47 PM
anlambert updated the diff for D6925: ra: Put externals in cache to avoid exporting them again.

Rebase

Jan 18 2022, 12:46 PM
anlambert updated the diff for D6895: ra: Add support for subversion external definitions.

Fix wrong rebase

Jan 18 2022, 12:45 PM
anlambert updated the diff for D6895: ra: Add support for subversion external definitions.

Update: Add missing root path check in DirEditor.add_directory

Jan 18 2022, 12:40 PM
anlambert updated the task description for T3858: Add diff features for class from_disk.Directory.
Jan 18 2022, 12:05 PM · Data Model
anlambert added a comment to D6950: ra: Send modified objects only to storage after replaying a revision.

Currently the subversion loader is the only one that needs the directory diff feature, so I think we can keep the implementation as it is for the moment, but I will create a task to implement the directory diff features in swh.model.from_disk.

Jan 18 2022, 12:04 PM
anlambert triaged T3858: Add diff features for class from_disk.Directory as Normal priority.
Jan 18 2022, 12:03 PM · Data Model
anlambert accepted D6961: Empty atom request request should raise a bad request (400).
Jan 18 2022, 11:37 AM
anlambert updated the diff for D6950: ra: Send modified objects only to storage after replaying a revision.

Rebase

Jan 18 2022, 11:26 AM
anlambert updated the diff for D6925: ra: Put externals in cache to avoid exporting them again.

Rebase

Jan 18 2022, 11:26 AM
anlambert updated the diff for D6895: ra: Add support for subversion external definitions.

Remove not needed pass instruction

Jan 18 2022, 11:25 AM
anlambert added inline comments to D6895: ra: Add support for subversion external definitions.
Jan 18 2022, 11:22 AM
anlambert added a comment to D6895: ra: Add support for subversion external definitions.

lgtm but I'm not sure I understood everything (code wise).

Jan 18 2022, 11:18 AM
anlambert updated the diff for D6950: ra: Send modified objects only to storage after replaying a revision.

Rebase

Jan 18 2022, 11:11 AM
anlambert updated the diff for D6925: ra: Put externals in cache to avoid exporting them again.

Rebase

Jan 18 2022, 11:11 AM
anlambert updated the diff for D6895: ra: Add support for subversion external definitions.

Rebase

Jan 18 2022, 11:10 AM
anlambert updated the diff for D6839: utils: Add a function to parse a subversion external definition.

Rebase

Jan 18 2022, 11:09 AM
anlambert added inline comments to D6961: Empty atom request request should raise a bad request (400).
Jan 18 2022, 11:06 AM
anlambert updated the diff for D6950: ra: Send modified objects only to storage after replaying a revision.

Rebase

Jan 18 2022, 11:01 AM
anlambert updated the diff for D6925: ra: Put externals in cache to avoid exporting them again.

Update: Preserve symlinks when copying an external tree.
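
A minimal sketch of what preserving symlinks during a tree copy can look like with the Python standard library. The helper name copy_external_tree and the paths are hypothetical and only illustrate the idea; this is not the loader's actual code.

```python
import os
import shutil
import tempfile


def copy_external_tree(src: str, dst: str) -> None:
    """Copy a directory tree, keeping symlinks as symlinks (hypothetical helper)."""
    # symlinks=True copies the links themselves instead of the files they point to
    shutil.copytree(src, dst, symlinks=True)


with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny tree containing one regular file and one relative symlink
    src = os.path.join(tmp, "external")
    os.makedirs(src)
    with open(os.path.join(src, "file.txt"), "w") as f:
        f.write("content")
    os.symlink("file.txt", os.path.join(src, "link.txt"))

    dst = os.path.join(tmp, "copy")
    copy_external_tree(src, dst)
    copied_link_is_symlink = os.path.islink(os.path.join(dst, "link.txt"))
```

Without symlinks=True, shutil.copytree follows the links and copies the target contents, so the copied tree would hash differently from the original.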

Jan 18 2022, 11:00 AM
anlambert updated the diff for D6895: ra: Add support for subversion external definitions.
  • Improve test when all externals got removed from a path
Jan 18 2022, 10:59 AM
anlambert updated the diff for D6839: utils: Add a function to parse a subversion external definition.

Update: Handle a couple of edge cases found when testing with real world repositories.

Jan 18 2022, 10:51 AM

Jan 17 2022

anlambert added a comment to T3854: Fix daily dump of mysql database.

I modified the concerned line in the /srv/data/etc/cron/anacrontab file the following way:

Jan 17 2022, 6:24 PM · Website
anlambert added a comment to T3854: Fix daily dump of mysql database.

It looks like it is not critical if both tables above are missing from the WordPress database: the publications list is still correctly displayed, while the lists of authors and tags will appear empty in the teachPress admin dashboard.

Jan 17 2022, 6:01 PM · Website
anlambert added a comment to T3854: Fix daily dump of mysql database.

There are two tables with the "Index column size too large" issue: wp_teachpress_authors and wp_teachpress_tags.

Jan 17 2022, 4:02 PM · Website
anlambert added a comment to T3854: Fix daily dump of mysql database.

So it turns out that we cannot perform any operations on tables affected by the "Index column size too large" error, sigh...

Jan 17 2022, 3:43 PM · Website
anlambert added a comment to T3854: Fix daily dump of mysql database.

Using the --single-transaction=TRUE option of mysqldump helped to get the name of the table with Index column size too large.

Jan 17 2022, 2:57 PM · Website
anlambert triaged T3854: Fix daily dump of mysql database as High priority.
Jan 17 2022, 12:03 PM · Website

Jan 14 2022

anlambert updated the diff for D6950: ra: Send modified objects only to storage after replaying a revision.

Add comment

Jan 14 2022, 2:43 PM
anlambert added a comment to D6950: ra: Send modified objects only to storage after replaying a revision.
In D6950#180642, @olasd wrote:

This looks like an impressive speedup, kudos.

Rather than add this logic on the svn loader only, we could consider either making swh.model.from_disk support incremental computations by keeping track of the ctime / mtime of on-disk data, and by collecting objects for the new loader. This would make this logic reusable by all loaders.

Without changing the swh.model.from_disk logic, we could also just diff the sets of objects between iterations (and rely on the OS cache for the new computation to be vaguely efficient), and only send new ones to the storage.
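
The second suggestion, diffing the sets of objects between iterations, could be sketched as follows. This is only an illustration under stated assumptions: the function new_objects, the byte identifiers, and the string payloads are hypothetical stand-ins, not the swh.model or loader API.

```python
from typing import Dict, Set


def new_objects(previous: Set[bytes], current: Dict[bytes, object]) -> Dict[bytes, object]:
    """Return only the objects whose identifier was not seen in the previous iteration."""
    return {obj_id: obj for obj_id, obj in current.items() if obj_id not in previous}


# Identifiers already sent after replaying revision N (illustrative values)
seen = {b"id-a", b"id-b"}
# Objects collected from disk after replaying revision N+1
collected = {b"id-a": "dir a", b"id-b": "file b", b"id-c": "file c"}

# Only the unseen object needs to be sent to the storage
to_send = new_objects(seen, collected)
# Remember everything collected so far for the next iteration
seen |= set(collected)
```

The trade-off is that the whole tree is still rehashed at each iteration; only the amount of data sent to the storage shrinks.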

Jan 14 2022, 2:38 PM
anlambert updated the diff for D6950: ra: Send modified objects only to storage after replaying a revision.

Rebase

Jan 14 2022, 1:19 PM
anlambert updated the diff for D6925: ra: Put externals in cache to avoid exporting them again.

Rebase

Jan 14 2022, 1:18 PM
anlambert updated the diff for D6895: ra: Add support for subversion external definitions.

Also filter out empty lines and commented ones when parsing external
definitions after a checkout.
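
As an illustration of that filtering step, here is a minimal sketch that assumes the svn:externals property value is available as a plain string. The helper name external_definition_lines and the example URLs are hypothetical.

```python
from typing import List


def external_definition_lines(prop_value: str) -> List[str]:
    """Keep only the lines of an svn:externals value that define an external."""
    lines = []
    for line in prop_value.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines (those starting with '#')
        if not stripped or stripped.startswith("#"):
            continue
        lines.append(stripped)
    return lines


prop = """
# third-party code
^/vendor/lib third-party/lib

  # disabled: http://svn.example.org/repo/tools tools
-r 42 http://svn.example.org/repo/skins skins
"""
definitions = external_definition_lines(prop)
```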

Jan 14 2022, 1:17 PM
anlambert updated the diff for D6950: ra: Send modified objects only to storage after replaying a revision.

Rebase

Jan 14 2022, 12:54 PM
anlambert updated the diff for D6925: ra: Put externals in cache to avoid exporting them again.

Rebase

Jan 14 2022, 12:54 PM
anlambert updated the diff for D6895: ra: Add support for subversion external definitions.
  • Use context manager to create temporary directory for checkout
  • Update years in test_loader.py license header
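
A short sketch of the context manager pattern mentioned in the first item, using tempfile.TemporaryDirectory from the standard library. The variable names and the placeholder file are hypothetical; the actual checkout call is not shown here.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as checkout_dir:
    # A real loader would run the checkout into checkout_dir here;
    # a placeholder file stands in for the checked out content.
    placeholder = os.path.join(checkout_dir, "placeholder.txt")
    with open(placeholder, "w") as f:
        f.write("checked out content")
    existed_inside = os.path.isfile(placeholder)

# The directory and its content are removed when the block exits,
# even if an exception was raised inside it.
removed_after = not os.path.exists(checkout_dir)
```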
Jan 14 2022, 12:52 PM
anlambert requested review of D6950: ra: Send modified objects only to storage after replaying a revision.
Jan 14 2022, 12:19 PM
anlambert added a revision to T3839: Optimize SVN loader performance and memory consumption on large repositories: D6950: ra: Send modified objects only to storage after replaying a revision.
Jan 14 2022, 12:17 PM · SVN Loader
anlambert updated the diff for D6925: ra: Put externals in cache to avoid exporting them again.

Rebase

Jan 14 2022, 12:12 PM
anlambert updated the diff for D6895: ra: Add support for subversion external definitions.

Optimize subversion export operation when dealing with externals: use the
origin URL as export parameter only if we know that some externals are
defined as relative to the repository URL and target a path outside the
repository.

Jan 14 2022, 12:10 PM
anlambert updated the diff for D6839: utils: Add a function to parse a subversion external definition.

Add a boolean in the tuple returned by the parse function indicating
if the external URL was defined as relative to the repository one
but targets a path outside the repository.

Jan 14 2022, 12:01 PM
anlambert triaged T3848: Activate saved origin browse link only when loading data are available in database as Normal priority.
Jan 14 2022, 11:42 AM · Save Code Now, Web app

Jan 13 2022

anlambert accepted D6893: Increase retries for random walks from 5 to 10.

Great, thanks !

Jan 13 2022, 4:06 PM
anlambert committed rDWAPPS3809abf63da7: Makefile.local: Pass --frozen-lockfile option to yarn install (authored by anlambert).
Makefile.local: Pass --frozen-lockfile option to yarn install
Jan 13 2022, 3:38 PM
anlambert closed D6942: browse/snapshot_context: Fix revisions log link display regression.
Jan 13 2022, 3:38 PM
anlambert committed rDWAPPS68ec3d83ab5d: browse/snapshot_context: Fix revisions log link display regression (authored by anlambert).
browse/snapshot_context: Fix revisions log link display regression
Jan 13 2022, 3:38 PM
anlambert requested review of D6942: browse/snapshot_context: Fix revisions log link display regression.
Jan 13 2022, 3:32 PM
anlambert closed D6941: common/origin_save: Fix elasticsearch request.
Jan 13 2022, 2:29 PM
anlambert committed rDWAPPS491576531af7: common/origin_save: Fix elasticsearch request (authored by anlambert).
common/origin_save: Fix elasticsearch request
Jan 13 2022, 2:29 PM
anlambert requested review of D6941: common/origin_save: Fix elasticsearch request.
Jan 13 2022, 1:02 PM
anlambert created P1255 (An Untitled Masterwork).
Jan 13 2022, 12:38 PM

Jan 12 2022

anlambert added a comment to T3845: Requesting cooking with email address + "Enter " key returns an error.

This is because the vault modal does not contain a real HTML form so there is no submit event sent when pressing Enter.

Jan 12 2022, 6:35 PM · Web app
anlambert closed T3836: Define and implement an anti-DoS policy for graph visits using the max_edges parameter as Resolved.

Anti-DoS policy has been implemented and deployed. The max_edges thresholds can be easily changed by configuration.
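
To illustrate how a configurable threshold of this kind can be applied, here is a minimal sketch. The configuration keys, the distinction between anonymous and authenticated users, and the function effective_max_edges are all assumptions made for the example and do not describe the deployed policy.

```python
from typing import Dict, Optional

# Hypothetical configuration: one threshold per user category
GRAPH_CONFIG: Dict[str, int] = {
    "max_edges_anonymous": 1000,
    "max_edges_user": 100000,
}


def effective_max_edges(
    requested: Optional[int],
    authenticated: bool,
    config: Dict[str, int] = GRAPH_CONFIG,
) -> int:
    """Clamp the requested max_edges value to the configured threshold."""
    key = "max_edges_user" if authenticated else "max_edges_anonymous"
    limit = config[key]
    # No value or a non-positive value means "use the configured limit"
    if requested is None or requested <= 0:
        return limit
    return min(requested, limit)
```

For instance, effective_max_edges(50, authenticated=False) keeps the smaller requested value, while an oversized request is reduced to the configured limit.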

Jan 12 2022, 6:31 PM · Web app
anlambert closed T3840: "'NoneType' object has no attribute 'split'" on /browse/snapshot/log/ as Resolved.

Fixed and deployed.

Jan 12 2022, 6:17 PM · Web app
anlambert closed D6914: api/graph: Implement anti-DoS policies for graph visits.
Jan 12 2022, 3:00 PM
anlambert closed D6919: api/graph: Handle query parameters that might be passed in graph_query.
Jan 12 2022, 3:00 PM
anlambert committed rDWAPPS9e7732ccebe3: api/graph: Handle query parameters that might be passed in graph_query (authored by anlambert).
api/graph: Handle query parameters that might be passed in graph_query
Jan 12 2022, 3:00 PM
anlambert committed rDWAPPSe870ec2b9a40: api/graph: Implement anti-DoS policies for graph visits (authored by anlambert).
api/graph: Implement anti-DoS policies for graph visits
Jan 12 2022, 3:00 PM
anlambert requested review of D6925: ra: Put externals in cache to avoid exporting them again.
Jan 12 2022, 2:59 PM
anlambert added inline comments to D6895: ra: Add support for subversion external definitions.
Jan 12 2022, 2:58 PM
anlambert added a revision to T611: support for external definitions in the svn/subversion loader: D6925: ra: Put externals in cache to avoid exporting them again.
Jan 12 2022, 2:57 PM · SVN Loader
anlambert updated the diff for D6895: ra: Add support for subversion external definitions.

Rebase and ensure path of external root directory is stored in the external paths set.

Jan 12 2022, 2:55 PM
anlambert updated the diff for D6839: utils: Add a function to parse a subversion external definition.

Rebase

Jan 12 2022, 2:53 PM
anlambert updated the diff for D6914: api/graph: Implement anti-DoS policies for graph visits.

Rebase

Jan 12 2022, 2:38 PM
anlambert updated the diff for D6919: api/graph: Handle query parameters that might be passed in graph_query.

Use urlunparse.
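
For context, a minimal sketch of merging extra query parameters into a URL that may already carry some, using urlparse and urlunparse from the standard library. The helper name add_query_params and the example URL are hypothetical, not the code of the diff.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def add_query_params(url: str, extra: dict) -> str:
    """Return url with the extra parameters merged into its query string."""
    parsed = urlparse(url)
    # Keep the parameters already present in the URL, then add or override
    params = dict(parse_qsl(parsed.query))
    params.update(extra)
    # Rebuild the URL with the updated query component
    return urlunparse(parsed._replace(query=urlencode(params)))


forwarded = add_query_params(
    "http://graph.example.org/graph/leaves/node-id?direction=forward",
    {"max_edges": 1000},
)
```

Rebuilding the URL from its parsed components avoids manual string concatenation mistakes, such as adding a second "?" when the query string is not empty.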

Jan 12 2022, 2:35 PM
anlambert added inline comments to D6919: api/graph: Handle query parameters that might be passed in graph_query.
Jan 12 2022, 2:34 PM
anlambert added inline comments to D6895: ra: Add support for subversion external definitions.
Jan 12 2022, 1:22 PM
anlambert added a comment to D6895: ra: Add support for subversion external definitions.

How do the tests check integrity of the results? (ie. there are the right revisions, the right files, etc.)

Jan 12 2022, 1:14 PM
anlambert added a comment to D6895: ra: Add support for subversion external definitions.

Does the incremental loader update the remote repo if the svn:externals property was not touched?

Jan 12 2022, 1:06 PM
anlambert accepted D6889: cassandra: Make content_missing run in linear time instead of quadratic.

It should be faster indeed, looks good to me.

Jan 12 2022, 11:36 AM
anlambert accepted D6888: cassandra: Rewrite content_missing to run queries concurrently..

Looks good to me.

Jan 12 2022, 11:27 AM