Rebase on master
- Queries
- All Stories
- Search
- Advanced Search
- Transactions
- Transaction Logs
Advanced Search
Jul 19 2019
Rebase and update
- Change index storage mechanism
The best solution is: tox -- -k test_name
Jul 16 2019
- Add break to prevent multiple yields
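The fix above can be illustrated with a small generator sketch (function and field names here are hypothetical, not the actual swh code): without the break, a content matching on several hashes would be yielded once per matching hash.

```python
def find_matches(contents, target_hashes):
    """Yield each content at most once, even if several of its
    hashes appear in target_hashes."""
    for content in contents:
        for h in content["hashes"]:
            if h in target_hashes:
                yield content
                break  # without this, a content matching two hashes is yielded twice

contents = [{"id": 1, "hashes": ["a", "b"]}, {"id": 2, "hashes": ["c"]}]
print([c["id"] for c in find_matches(contents, {"a", "b"})])  # → [1]
```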
- Use all hashes in a content
Jul 13 2019
- Remove dependency
- Modify in_memory content_add to add skipped_content
I did not find any mechanism in db.py that actually stores skipped_content. db.py line 51 is just a pass statement, with no implementation.
In db.py line 128, the query does not compare blake2s256 despite content_hash_keys = ['sha1', 'sha1_git', 'sha256', 'blake2s256']. Does this mean that skipped content will never be hashed with blake2s256?
I have a concern here, at storage.py line 120: self.content_missing can throw an exception in the case of a hash collision. Shouldn't line 120 be wrapped in a try/except block to catch that error and ignore that particular content?
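A sketch of the defensive handling suggested above (the exception class and the surrounding loop are assumptions for illustration, not the actual swh code):

```python
# Hypothetical sketch: skip contents whose hash lookup collides,
# instead of letting the exception abort the whole batch.
class HashCollision(Exception):
    pass

def missing_filtered(storage, contents):
    missing = []
    for content in contents:
        try:
            missing.extend(storage.content_missing([content]))
        except HashCollision:
            continue  # ignore this particular content, as suggested above
    return missing
```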
Jul 11 2019
Use db_transaction annotation
Jul 10 2019
small edit and rebase
Jul 9 2019
I went through all the tests in test_storage.py. It appears that only content_fossology_license_get needs to be refactored. All other storage methods return a dictionary or a list of dictionaries, where each dictionary has multiple keys.
Jul 7 2019
I am familiar with the web APIs, and I went through the discussion in T782. When you say output a single dictionary, I believe you mean something like this:
{ sha1: [ {tool: TOOL, licenses: [licenses]},
          {tool: TOOL, licenses: [licenses]} ],
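If that interpretation is right, collapsing per-row results into that single-dictionary shape could look like the following sketch (the row field names are assumptions, not the actual swh schema):

```python
from collections import defaultdict

# Hypothetical sketch: group per-row license results under their sha1.
def group_by_sha1(rows):
    result = defaultdict(list)
    for row in rows:
        result[row["id"]].append(
            {"tool": row["tool"], "licenses": row["licenses"]}
        )
    return dict(result)

rows = [
    {"id": "sha1_a", "tool": "nomos", "licenses": ["GPL-2.0"]},
    {"id": "sha1_a", "tool": "monk", "licenses": ["MIT"]},
]
```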
Jul 6 2019
This logic is similar to the one used in db.skipped_content_missing. However, the current implementation of in_memory._content_add does not populate _skipped_contents and _skipped_content_indexes.
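A minimal sketch of what populating those structures might look like; only the attribute names come from the comment above, and everything else (hash algorithms, key layout) is assumed:

```python
# Hash algorithms as listed elsewhere in this discussion.
HASH_ALGOS = ('sha1', 'sha1_git', 'sha256', 'blake2s256')

class InMemoryStorageSketch:
    def __init__(self):
        self._skipped_contents = {}
        # algo -> hash value -> set of content keys
        self._skipped_content_indexes = {algo: {} for algo in HASH_ALGOS}

    def skipped_content_add(self, content):
        # Key the content by the tuple of all its hashes (None if absent).
        key = tuple(content.get(algo) for algo in HASH_ALGOS)
        self._skipped_contents[key] = content
        for algo, value in zip(HASH_ALGOS, key):
            if value is not None:
                self._skipped_content_indexes[algo].setdefault(value, set()).add(key)
```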
Jul 2 2019
After re-reading the documentation, I realized that the configuration files given in swh-indexer are actually used by swh-scheduler and swh-storage, indicating that these two modules are at the root of the dependency tree.
I looked at how the docs look after being built; for example, take swh-indexer at https://docs.softwareheritage.org/devel/swh-indexer/dev-info.html. It seems like the configuration information, along with instructions to run and test the module, is best suited to this page. Have you considered adding comments about configuration parameters on this page itself, rather than making a top-level file, since only someone hacking on swh-indexer would be interested in the configuration?
I can begin working on it, once I understand what is required. Is my interpretation of the task correct?
Jul 1 2019
Jun 29 2019
This is related to T1388.
Jun 28 2019
Rebase from master
To expand on this further: I think the pre-push hook scripts you have configured only work when using git in a terminal. I am using VS Code and, curiously, when I used the GUI to push changes, the scripts did not run, so I could push changes without review. This is probably a security flaw that should be taken seriously. @zack
It seems like my revision is already in origin/master. There are no changes to push. Should I close this revision?
Jun 27 2019
Aren't all docstrings rendered somewhere in the documentation, for packages, submodules, and the functions they contain?
While formatting the documents, I realized that whoever wrote them was trying to keep line lengths short, but they followed an arbitrary line length longer than 80 characters. Since # noqa was applied, these formatting errors did not pop up. I think a lot of this could be avoided with a guideline to add a vertical ruler to the editor at 80 characters.
Fix more docstrings
Reformat docstring wherever possible
Reformat docstring for max line length
I'll close this revision.
I get it now; that's an ingenious way of keeping documentation up to date. However, if this is evaluated, why does adding whitespace change the output? The dictionary and list should still have valid items.
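If the mechanism in question is doctest (an assumption on my part), the whitespace sensitivity follows from how doctest compares results: it matches the printed text character for character rather than comparing the evaluated values. A minimal illustration:

```python
import doctest

# Expected output matches exactly what the interpreter prints.
good = """
>>> {'a': 1}
{'a': 1}
"""

# Extra space after the colon: same dictionary value, different text.
bad = """
>>> {'a': 1}
{'a':  1}
"""

def run(docstring):
    # Run the examples in a docstring and count textual mismatches.
    parser = doctest.DocTestParser()
    test = parser.get_doctest(docstring, {}, "example", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    runner.run(test, out=lambda s: None)
    return runner.failures

print(run(good), run(bad))  # → 0 1
```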
Then swh-indexer does not require any changes; I will close this diff. Wow, I never knew docstrings could be evaluated. Why is this required, though?
I will close this diff since no changes are required for swh-deposit. With complex cases like this, there is no way this process can be automated.
I followed the swh-docker-dev documentation to host the setup locally, but it only serves the web portal; I couldn't access the locally hosted documentation. How can I see the effect my changes are making?
It is odd that the py3 test cases are failing, although I have only changed the formatting of the docstrings.
Jun 26 2019
What is the expected formatting for snippets like these?
the 80 character mark is -----------------------------------------------------------------|

@browse_route(r'origin/(?P<origin_type>[a-z]+)/url/(?P<origin_url>.+)/visit/(?P<timestamp>.+)/directory/',  # noqa
              r'origin/(?P<origin_type>[a-z]+)/url/(?P<origin_url>.+)/visit/(?P<timestamp>.+)/directory/(?P<path>.+)/',  # noqa
              r'origin/(?P<origin_type>[a-z]+)/url/(?P<origin_url>.+)/directory/',  # noqa
              r'origin/(?P<origin_type>[a-z]+)/url/(?P<origin_url>.+)/directory/(?P<path>.+)/',  # noqa
              r'origin/(?P<origin_url>.+)/visit/(?P<timestamp>.+)/directory/',  # noqa
              r'origin/(?P<origin_url>.+)/visit/(?P<timestamp>.+)/directory/(?P<path>.+)/',  # noqa
              r'origin/(?P<origin_url>.+)/directory/',  # noqa
              r'origin/(?P<origin_url>.+)/directory/(?P<path>.+)/',  # noqa
              view_name='browse-origin-directory')
def origin_directory_browse(request, origin_url, origin_type=None,
                            timestamp=None, path=None):
    """Django view for browsing the content of a directory associated
    to an origin for a given visit.
"""Django view that produces an HTML display of a content identified by its hash value.
Jun 25 2019
https://docs.softwareheritage.org/devel/apidoc/swh.model.html#module-swh.model.identifiers
This explains some of the fields associated with dates, timestamps and offset.
This task is completed; it can be closed.
Merge with master
Ready to land; made all the required changes.
Remove tox dependency
Remove + symbol from end of line
Add assert_called_once_with to test case
Fix test case. Silly mistakes are costly.
Jun 24 2019
While I was adding comments to all the tables in the db, I experimented a bit with pg_dump.
Some databases could benefit from some backups without the overhead of having point in time recovery set up for them
If I understand correctly, that means recovery to arbitrary previous points in time is not a concern here. In that case, is a cron job running pg_dump at regular intervals feasible?
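To make the suggestion concrete, a small wrapper script that cron could invoke at regular intervals might look like this (database names, output paths, and the helper itself are hypothetical; only the pg_dump flags -Fc and -f are real):

```python
import datetime
import subprocess

def backup(dbname, out_dir):
    """Dump one database to a timestamped file; meant to be run from cron."""
    stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    outfile = f"{out_dir}/{dbname}-{stamp}.dump"
    # -Fc writes a custom-format archive, restorable with pg_restore.
    subprocess.run(["pg_dump", "-Fc", "-f", outfile, dbname], check=True)
    return outfile
```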
Jun 23 2019
- Change key reference from origin_id to id
Jun 22 2019
with patch('swh.web.common.service.idx_storage') as mock_idx_storage, \
        patch('swh.web.common.service.storage') as mock_storage:
Adding a backslash solves it.
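For reference, nesting the with statements avoids the line-continuation character entirely and works on any Python version. A self-contained sketch (stdlib targets are used here instead of the swh ones so it runs standalone):

```python
from unittest.mock import patch
import os

# Nested with statements sidestep the backslash continuation:
# each patch gets its own short, easily wrapped line.
with patch('os.path.exists') as mock_exists:
    with patch('os.getcwd') as mock_getcwd:
        mock_exists.return_value = True
        mock_getcwd.return_value = '/mocked'
        print(os.path.exists('anything'), os.getcwd())  # → True /mocked
```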
with patch('swh.web.common.service.idx_storage') as mock_idx_storage, patch('swh.web.common.service.storage') as mock_storage:
giving a syntax error on mock_idx_storage