Jun 26 2019

twitu added a comment to T1836: Reformat docstrings that exceed 80 columns.
"""Django view that produces an HTML display of a content identified
    by its hash value.
Jun 26 2019, 6:37 PM · Documentation, Easy hack

Jun 25 2019

twitu added a comment to T1839: Write glossary/taxonomy for push archival process and mechanism.

https://docs.softwareheritage.org/devel/apidoc/swh.model.html#module-swh.model.identifiers
This explains some of the fields associated with dates, timestamps and offset.

Jun 25 2019, 6:31 PM · Community Building, Documentation
twitu closed T1527: Have comments on all columns of all databases as Resolved.
Jun 25 2019, 6:25 PM · Easy hack, Documentation, Storage manager, Scheduling utilities, Indexer
twitu added a comment to T1613: Add a public API endpoint to get the metadata of an origin.

This task is completed; it can be closed.

Jun 25 2019, 6:24 PM · Easy hack, Metadata workflow, Web app
twitu committed rDWAPPS9a8e1f98c9fd: Add origin_metadata_get API endpoint (authored by twitu).
Add origin_metadata_get API endpoint
Jun 25 2019, 6:10 PM
twitu closed D1623: Add origin_metadata_get API endpoint.
Jun 25 2019, 6:10 PM
twitu updated the diff for D1623: Add origin_metadata_get API endpoint.

Merge with master

Jun 25 2019, 5:57 PM
twitu added a comment to D1623: Add origin_metadata_get API endpoint.

Ready to land; made all the required changes.

Jun 25 2019, 5:47 PM
twitu updated the diff for D1623: Add origin_metadata_get API endpoint.

Remove tox dependency

Jun 25 2019, 5:41 PM
twitu updated the diff for D1623: Add origin_metadata_get API endpoint.

Remove + symbol from end of line

Jun 25 2019, 5:40 PM
twitu updated the diff for D1623: Add origin_metadata_get API endpoint.

Fix typo

Jun 25 2019, 5:31 PM
twitu updated the diff for D1623: Add origin_metadata_get API endpoint.

Add assert_called_once_with to test case

Jun 25 2019, 5:18 PM
twitu updated the diff for D1623: Add origin_metadata_get API endpoint.

Fix test case. Silly mistakes are costly.

Jun 25 2019, 5:09 AM

Jun 24 2019

twitu added inline comments to D1623: Add origin_metadata_get API endpoint.
Jun 24 2019, 7:26 PM
twitu added a comment to T881: PostgreSQL backups based on pg_dump.

While I was adding comments to all the tables in the db, I experimented a bit with pg_dump.

Some databases could benefit from some backups without the overhead of having point in time recovery set up for them

If I understand correctly, it means that recovery from all previous timestamps is not a concern here. In such a case, is a cron job running pg_dump at regular intervals feasible?
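
A minimal sketch of what such a periodic job could run, assuming a hypothetical database name and backup directory (neither comes from this task). It only builds and runs a `pg_dump` invocation in custom format with a timestamped file name; the scheduling itself would be left to cron.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def build_pg_dump_command(dbname, backup_dir, now=None):
    """Return the pg_dump argv for a timestamped custom-format dump."""
    now = now or datetime.now(timezone.utc)
    target = Path(backup_dir) / f"{dbname}-{now:%Y%m%dT%H%M%SZ}.dump"
    # --format=custom writes an archive that pg_restore can read back.
    return ["pg_dump", "--format=custom", f"--file={target}", dbname]


def run_backup(dbname, backup_dir):
    """Run pg_dump once; raises CalledProcessError if the dump fails."""
    subprocess.run(build_pg_dump_command(dbname, backup_dir), check=True)
```

Keeping the command construction separate from the call makes it possible to check the file naming without a running PostgreSQL server.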

Jun 24 2019, 6:54 PM · System administration
twitu updated the task description for T1527: Have comments on all columns of all databases.
Jun 24 2019, 6:28 PM · Easy hack, Documentation, Storage manager, Scheduling utilities, Indexer
twitu updated the summary of D1623: Add origin_metadata_get API endpoint.
Jun 24 2019, 6:14 PM
twitu updated the summary of D1623: Add origin_metadata_get API endpoint.
Jun 24 2019, 6:13 PM
twitu added inline comments to D1623: Add origin_metadata_get API endpoint.
Jun 24 2019, 5:55 PM

Jun 23 2019

twitu added a comment to P448 Pytest error stack trace.

reference to D1623

Jun 23 2019, 2:57 PM
twitu created P448 Pytest error stack trace in the S1 Public space.
Jun 23 2019, 2:55 PM
twitu updated the diff for D1623: Add origin_metadata_get API endpoint.
  • Change key reference from origin_id to id
Jun 23 2019, 2:48 PM

Jun 22 2019

twitu created P447 Run a single test file in the S1 Public space.
Jun 22 2019, 12:26 PM
twitu added a comment to P446 Multiple context managers.
with patch('swh.web.common.service.idx_storage') as mock_idx_storage, \
             patch('swh.web.common.service.storage') as mock_storage:

Adding backslash solves it
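
For reference, a self-contained sketch of two spellings that parse on the Python 3 versions of that time: the backslash continuation, and `contextlib.ExitStack` as an alternative. The patched targets (`os.getcwd`, `os.getpid`) are stand-ins chosen so the snippet runs anywhere; they are not the swh-web targets from the paste.

```python
import os
from contextlib import ExitStack
from unittest.mock import patch


def with_backslash():
    # A trailing backslash continues the `with` statement onto the next line.
    with patch('os.getcwd') as mock_getcwd, \
            patch('os.getpid') as mock_getpid:
        mock_getcwd.return_value = '/fake'
        mock_getpid.return_value = 1234
        return os.getcwd(), os.getpid()


def with_exit_stack():
    # ExitStack avoids the continuation by entering managers one at a time.
    with ExitStack() as stack:
        mock_getcwd = stack.enter_context(patch('os.getcwd'))
        mock_getpid = stack.enter_context(patch('os.getpid'))
        mock_getcwd.return_value = '/fake'
        mock_getpid.return_value = 1234
        return os.getcwd(), os.getpid()
```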

Jun 22 2019, 10:06 AM
twitu added a comment to P446 Multiple context managers.
with patch('swh.web.common.service.idx_storage') as mock_idx_storage,
             patch('swh.web.common.service.storage') as mock_storage:

giving syntax error on mock_idx_storage

Jun 22 2019, 9:56 AM
twitu added a comment to P446 Multiple context managers.
Jun 22 2019, 9:56 AM
twitu added a comment to P446 Multiple context managers.

flake8 gives syntax error on first as

Jun 22 2019, 9:51 AM
twitu created P446 Multiple context managers in the S1 Public space.
Jun 22 2019, 9:50 AM
twitu added a comment to T1839: Write glossary/taxonomy for push archival process and mechanism.

I too noticed the two glossaries. They have varying levels of explanation, and merging them is one of the tasks I have considered for my GSoD application. I will be glad to assist in the process in any way I can.

Jun 22 2019, 8:07 AM · Community Building, Documentation

Jun 21 2019

twitu added inline comments to D1623: Add origin_metadata_get API endpoint.
Jun 21 2019, 9:18 PM
twitu updated the diff for D1623: Add origin_metadata_get API endpoint.
  • Remove unnecessary tox.ini dependency
Jun 21 2019, 9:15 PM
twitu updated the diff for D1623: Add origin_metadata_get API endpoint.
  • Fix intrinsic metadata endpoint and test case
Jun 21 2019, 9:13 PM
twitu added a comment to T1836: Reformat docstrings that exceed 80 columns.

I have checked autopep8 and black; neither claims to reformat long docstrings properly. This may require custom scripting or editing by hand. However, autopep8 did suggest some other changes when I tried it out. Is there a reason a linter is used instead of a formatter?
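
As an illustration of the custom scripting route, here is a rough sketch that only locates over-long docstring lines using the standard `ast` module; it does not rewrite anything. The function name and the 79-column default are assumptions. It measures the docstring text as parsed, so the indentation and opening quotes in front of a docstring's first line are not counted.

```python
import ast


def long_docstring_lines(source, max_cols=79):
    """Yield (name, line) for docstring lines longer than max_cols."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if not isinstance(node, (ast.Module, ast.ClassDef,
                                 ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        # clean=False keeps the original indentation of continuation lines.
        docstring = ast.get_docstring(node, clean=False)
        if docstring is None:
            continue
        name = getattr(node, 'name', '<module>')
        for line in docstring.splitlines():
            if len(line) > max_cols:
                yield name, line
```

The output of such a pass could serve as a worklist for editing by hand.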

Jun 21 2019, 8:54 AM · Documentation, Easy hack
twitu added a comment to D1623: Add origin_metadata_get API endpoint.

in reference to T1613
I am returning entry for only one origin id. Is this expected functionality?

Jun 21 2019, 8:48 AM
Herald added a reviewer for D1623: Add origin_metadata_get API endpoint: Reviewers.
Jun 21 2019, 8:45 AM

Jun 20 2019

twitu added a comment to T1613: Add a public API endpoint to get the metadata of an origin.

I want to clarify the return value for this API. In swh-storage, the table origin_metadata contains the metadata of an origin for a visit, listing etc. The primary key is an auto-incremented value. This means that there can be multiple entries for a single origin_id. What is the expected JSON response for this API? I can include origin_id and a list of metadata values; is there anything else to return?
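
To make the question concrete, a sketch of the shape being proposed: one object per origin, carrying a list with every matching row. The row keys used here (`id`, `origin_id`, `metadata`) and the function name are illustrative assumptions, not the actual swh-storage schema or swh-web code.

```python
import json


def origin_metadata_response(origin_id, rows):
    """Collect every metadata row of one origin into a single payload."""
    entries = [
        # Drop the repeated origin_id from each entry; it is stated once above.
        {key: value for key, value in row.items() if key != 'origin_id'}
        for row in rows
        if row['origin_id'] == origin_id
    ]
    return {'origin_id': origin_id, 'metadata': entries}


if __name__ == '__main__':
    rows = [
        {'id': 1, 'origin_id': 10, 'metadata': {'name': 'a'}},
        {'id': 2, 'origin_id': 10, 'metadata': {'name': 'b'}},
    ]
    print(json.dumps(origin_metadata_response(10, rows), indent=2))
```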

Jun 20 2019, 5:28 PM · Easy hack, Metadata workflow, Web app
twitu added a comment to T1613: Add a public API endpoint to get the metadata of an origin.

Ok I'll try to implement and test this by tomorrow.

Jun 20 2019, 2:48 PM · Easy hack, Metadata workflow, Web app
twitu added a comment to T1613: Add a public API endpoint to get the metadata of an origin.

This will also require adding a function in swh/web/common/service.py that makes the query to swh-storage. The response will then have to be converted to a JSON response. I would like to take this up and add this API. Please let me know if there is anything wrong with the changes I am suggesting.

Jun 20 2019, 2:41 PM · Easy hack, Metadata workflow, Web app
twitu added a comment to T1527: Have comments on all columns of all databases.

D1582 has been pushed; the task can be closed.

Jun 20 2019, 10:26 AM · Easy hack, Documentation, Storage manager, Scheduling utilities, Indexer
twitu committed rDMOD10728848a0ca: Add pyblake2 platform specific dependency (authored by twitu).
Add pyblake2 platform specific dependency
Jun 20 2019, 10:17 AM
twitu closed D1574: Added pyblake2 in py3 test dependency.
Jun 20 2019, 10:17 AM
twitu committed rDSTO2ead4ce360ba: Added comments for all tables and columns (authored by twitu).
Added comments for all tables and columns
Jun 20 2019, 10:14 AM
twitu closed D1582: Add comments to tables dbversion, content, skipped_content and fetch_history.
Jun 20 2019, 10:14 AM
twitu updated the diff for D1582: Add comments to tables dbversion, content, skipped_content and fetch_history.

Add comments to tables and columns

Jun 20 2019, 9:20 AM

Jun 19 2019

twitu committed rDSCH09e724573495: Added comments to few columns in dbversion, task and task_run (authored by twitu).
Added comments to few columns in dbversion, task and task_run
Jun 19 2019, 6:29 PM
twitu closed D1590: Add comments to few columns in dbversion, task and task_run.
Jun 19 2019, 6:29 PM
twitu updated the diff for D1590: Add comments to few columns in dbversion, task and task_run.

Add comments to few columns in dbversion, task and task_run

Jun 19 2019, 6:14 PM
twitu updated the diff for D1590: Add comments to few columns in dbversion, task and task_run.
  • Added comments to few columns in dbversion, task and task_run
  • Made changes as per review
Jun 19 2019, 5:06 PM
twitu updated the diff for D1582: Add comments to tables dbversion, content, skipped_content and fetch_history.
  • Make changes as per review
Jun 19 2019, 5:02 PM
twitu added a comment to T1613: Add a public API endpoint to get the metadata of an origin.

If this issue is open, I can work on this. I believe I have to add the endpoint in /swh-web/swh/web/api/views/origin.py. I will probably have to populate the db with an origin and test the api by making http requests, or is there a better way?
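
One alternative to populating a database is to replace the storage backend with a mock in the test. The sketch below is an assumption about how that could look; `lookup_origin_metadata` and the `origin_metadata_get` method on the mocked storage are hypothetical names used only for illustration.

```python
from unittest.mock import Mock


def lookup_origin_metadata(storage, origin_id):
    """Hypothetical service helper: fetch and list an origin's metadata."""
    return list(storage.origin_metadata_get(origin_id))


def test_lookup_origin_metadata():
    storage = Mock()
    storage.origin_metadata_get.return_value = iter([{'id': 1}])
    assert lookup_origin_metadata(storage, 42) == [{'id': 1}]
    # Verifies the helper queried the backend exactly once with the right id.
    storage.origin_metadata_get.assert_called_once_with(42)
```

The test exercises only the helper's own logic, so it needs neither a database nor HTTP requests.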

Jun 19 2019, 1:48 PM · Easy hack, Metadata workflow, Web app
twitu updated the diff for D1574: Added pyblake2 in py3 test dependency.

Merge branch with master

Jun 19 2019, 1:32 PM

Jun 18 2019

twitu added inline comments to D1582: Add comments to tables dbversion, content, skipped_content and fetch_history.
Jun 18 2019, 4:39 PM
twitu updated the diff for D1582: Add comments to tables dbversion, content, skipped_content and fetch_history.
  • List all options
Jun 18 2019, 4:36 PM
twitu updated the diff for D1582: Add comments to tables dbversion, content, skipped_content and fetch_history.
  • Make suggested changes
Jun 18 2019, 4:28 PM
twitu added a comment to D1574: Added pyblake2 in py3 test dependency.

Foolishly committed to master earlier, created new branch now

Jun 18 2019, 3:52 PM
twitu updated the diff for D1574: Added pyblake2 in py3 test dependency.

Add pyblake2 dependency for python<3.6

Jun 18 2019, 3:51 PM
twitu updated the diff for D1574: Added pyblake2 in py3 test dependency.
  • Move platform specific dependency to install_requires
Jun 18 2019, 3:08 PM
twitu added a comment to D1574: Added pyblake2 in py3 test dependency.

I built and tested the wheel locally, it did install pyblake2 as per the requirements.

Jun 18 2019, 2:59 PM
twitu updated the diff for D1590: Add comments to few columns in dbversion, task and task_run.
  • Make changes as per review
Jun 18 2019, 2:30 PM
twitu updated the diff for D1574: Added pyblake2 in py3 test dependency.

Add pyblake2 dependency for all python versions below 3.6

Jun 18 2019, 2:21 PM
twitu added a comment to D1582: Add comments to tables dbversion, content, skipped_content and fetch_history.

Does this diff require any other changes?

Jun 18 2019, 1:52 PM
twitu added a comment to D1590: Add comments to few columns in dbversion, task and task_run.

Does this diff require any other changes?

Jun 18 2019, 1:52 PM
twitu added a comment to D1574: Added pyblake2 in py3 test dependency.
import hashlib

blake2_requirements = []

pyblake2_hash_sets = [
    # Built-in implementation in Python 3.6+
    {'blake2s', 'blake2b'},
    # Potentially shipped by OpenSSL 1.1 (e.g. Python 3.5 in Debian stretch
    # has these)
    {'blake2s256', 'blake2b512'},
]

for pyblake2_hashes in pyblake2_hash_sets:
    if not pyblake2_hashes - set(hashlib.algorithms_available):
        # The required blake2 hashes have been found
        break
else:
    # None of the possible sets of blake2 hashes are available;
    # use pyblake2 instead
    blake2_requirements.append('pyblake2')
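
The `for`/`else` detection above can also be read as a single predicate. The following sketch restates it as a function that accepts the set of available algorithms, which makes the logic checkable on any interpreter; the function and constant names are made up for this illustration.

```python
import hashlib

BLAKE2_HASH_SETS = [
    {'blake2s', 'blake2b'},        # built into hashlib on Python 3.6+
    {'blake2s256', 'blake2b512'},  # names potentially exposed via OpenSSL 1.1
]


def needs_pyblake2(available=None):
    """Return True when no candidate set of blake2 hashes is fully available."""
    if available is None:
        available = hashlib.algorithms_available
    available = set(available)
    # A candidate set is satisfied when it is a subset of what hashlib offers.
    return not any(hashes <= available for hashes in BLAKE2_HASH_SETS)
```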
Jun 18 2019, 1:19 PM

Jun 17 2019

twitu updated the diff for D1582: Add comments to tables dbversion, content, skipped_content and fetch_history.
  • Double quote syntax error
Jun 17 2019, 6:49 PM
twitu updated the diff for D1582: Add comments to tables dbversion, content, skipped_content and fetch_history.
  • Add additional column comments
Jun 17 2019, 6:00 PM
twitu added a comment to D1582: Add comments to tables dbversion, content, skipped_content and fetch_history.

The comments are not complete. I will fix the syntax issue in the next commit along with the missing comments. I am not sure what comments to add for the revision table.

Jun 17 2019, 3:20 PM
twitu added a comment to T1815: Use a FOSS alternative or drop Google ReCAPTCHA use.

Django-simple-captcha works best out of the box using Forms or ModelForms. But the origin/save page is not rendered using forms; it's plain HTML. One possible solution is to use a Form for origin save submission; the other is to write a custom captcha template and include it in the page. Which one did you have in mind?

Jun 17 2019, 2:18 PM · Web app

Jun 15 2019

twitu updated the diff for D1582: Add comments to tables dbversion, content, skipped_content and fetch_history.
  • Added comments to tables dbversion, content, skipped_content and fetch_history
  • Added comments for all tables and columns
  • Converted double quote comments to single quote comments
Jun 15 2019, 7:26 PM
twitu updated the diff for D1590: Add comments to few columns in dbversion, task and task_run.
  • Change double quoted comments to single quoted comments
Jun 15 2019, 7:20 PM
twitu updated the diff for D1590: Add comments to few columns in dbversion, task and task_run.
  • Change double quoted comments to single quoted comments
Jun 15 2019, 7:16 PM
twitu updated the diff for D1582: Add comments to tables dbversion, content, skipped_content and fetch_history.
  • Add comments to few columns in dbversion, task and task_run
  • Change double quoted comments to single quoted comments
Jun 15 2019, 7:15 PM
twitu added a comment to T1527: Have comments on all columns of all databases.

All columns commented in swh-scheduler, waiting review.
Some columns for swh-storage required a small discussion to frame appropriate comments.

Jun 15 2019, 5:22 PM · Easy hack, Documentation, Storage manager, Scheduling utilities, Indexer
twitu added a comment to D1590: Add comments to few columns in dbversion, task and task_run.

in reference to T1527

Jun 15 2019, 5:18 PM
Herald added a reviewer for D1590: Add comments to few columns in dbversion, task and task_run: Reviewers.
Jun 15 2019, 5:17 PM

Jun 14 2019

twitu added inline comments to D1582: Add comments to tables dbversion, content, skipped_content and fetch_history.
Jun 14 2019, 9:29 PM
twitu updated the diff for D1582: Add comments to tables dbversion, content, skipped_content and fetch_history.
  • Added comments for all tables and columns
Jun 14 2019, 9:17 PM
twitu added inline comments to D1582: Add comments to tables dbversion, content, skipped_content and fetch_history.
Jun 14 2019, 5:30 PM
twitu added a comment to T1527: Have comments on all columns of all databases.

All columns are already commented in swh-indexer

Jun 14 2019, 5:18 PM · Easy hack, Documentation, Storage manager, Scheduling utilities, Indexer
twitu added a comment to T1527: Have comments on all columns of all databases.

Have added a few comments in D1582

Jun 14 2019, 8:30 AM · Easy hack, Documentation, Storage manager, Scheduling utilities, Indexer
twitu added a comment to D1582: Add comments to tables dbversion, content, skipped_content and fetch_history.

This diff is in reference to T1527.

Jun 14 2019, 8:20 AM
Herald added a reviewer for D1582: Add comments to tables dbversion, content, skipped_content and fetch_history: Reviewers.
Jun 14 2019, 8:19 AM

Jun 13 2019

twitu added a comment to T1527: Have comments on all columns of all databases.

There seems to be an inconsistency between sql/upgrades and the latest SQL version in swh-storage. The latest upgrade is 136.sql while the version in 30-swh-schema.sql is 133. Should I name the next upgrade 137?

Jun 13 2019, 6:54 PM · Easy hack, Documentation, Storage manager, Scheduling utilities, Indexer
twitu added a comment to D1574: Added pyblake2 in py3 test dependency.

I am on a standard Ubuntu 16.04 Xenial distribution. Are there any specific config files you want to see?

Jun 13 2019, 4:45 PM
twitu added a comment to D1574: Added pyblake2 in py3 test dependency.
.tox/py3/lib/python3.5/site-packages/swh/model/hashutil.py:211: in _new_hashlib_hash
    return _new_blake2_hash(algo)
Jun 13 2019, 1:11 PM
twitu added a comment to T1527: Have comments on all columns of all databases.

Is there anything left to be done to close the task?

Jun 13 2019, 12:09 PM · Easy hack, Documentation, Storage manager, Scheduling utilities, Indexer
twitu added a comment to D1574: Added pyblake2 in py3 test dependency.

I was clearing the existing tox setup with rm -rf .tox whenever I changed requirements. I will try the tests of other modules as well and report if I find similar errors.

Jun 13 2019, 10:48 AM

Jun 12 2019

twitu added a comment to T1527: Have comments on all columns of all databases.

The modules swh-scheduler, swh-indexer and swh-storage all seem to have column comments written in 30-swh-schema.sql.

Jun 12 2019, 7:44 PM · Easy hack, Documentation, Storage manager, Scheduling utilities, Indexer
twitu added a comment to D1574: Added pyblake2 in py3 test dependency.

https://forge.softwareheritage.org/P429
This is the error message from one of the tests. It results in all py3 tests failing.

Jun 12 2019, 6:28 PM
twitu created P429 error message without pyblake2 in the S1 Public space.
Jun 12 2019, 6:27 PM
twitu added a comment to T1527: Have comments on all columns of all databases.

Can you provide a few more details so I can work on this? Maybe which packages will be affected and what is expected in the comments.

Jun 12 2019, 6:21 PM · Easy hack, Documentation, Storage manager, Scheduling utilities, Indexer
twitu updated the diff for D1574: Added pyblake2 in py3 test dependency.
  • moved dependency to requirements-test.txt
Jun 12 2019, 5:25 PM
Herald added 1 required legal document(s) to D1574: Added pyblake2 in py3 test dependency: L3 Software Heritage Contributor License Agreement, version 1.0.
Jun 12 2019, 5:03 PM