
twitu (Ishan Bhanuka)
User

Projects

User does not belong to any projects.

User Details

User Since
May 11 2019, 6:37 PM (14 w, 9 h)

Recent Activity

Wed, Aug 14

twitu added a comment to D1720: Modify API output and test.
In D1720#42908, @twitu wrote:

Overriding just the test method doesn't work for TestFossologyLicenseRangeIndexer. The tests test__index_contents and test__index_contents_with_indexed_data call self.indexer._index_contents, which is implemented in ContentRangeIndexer.
It calls a function, index, which is not implemented, and it yields values although the docstring says it should return a dictionary. I am not sure how to resolve this.
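
A minimal sketch of the mismatch described above, where a method yields while its docstring promises a dictionary. The function names and result fields here are hypothetical and are not the swh-indexer API; the point is only that a caller expecting a dict cannot use a generator the same way.

```python
def index_as_generator(items):
    """Yield one result dict per item (this makes it a generator function)."""
    for item in items:
        yield {"id": item, "status": "indexed"}


def index_as_dict(items):
    """Return a single dict mapping each item to its result."""
    return {item: {"status": "indexed"} for item in items}


gen_result = index_as_generator(["a", "b"])
dict_result = index_as_dict(["a", "b"])

# A generator is not a dict: callers must iterate over it or wrap it in list().
is_generator_a_dict = isinstance(gen_result, dict)
collected = list(gen_result)
```

Whichever form is chosen, the docstring and the tests would need to agree with it.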

Wed, Aug 14, 5:36 AM
twitu updated the diff for D1720: Modify API output and test.
  • Fix return value for license indexer
Wed, Aug 14, 5:27 AM

Tue, Aug 13

twitu added a comment to D1720: Modify API output and test.

Overriding just the test method doesn't work for TestFossologyLicenseRangeIndexer. Tests test__index_contents and test__index_contents_with_indexed_data call the function

Tue, Aug 13, 10:42 PM
twitu updated the diff for D1720: Modify API output and test.
  • Fix return value for license indexer
Tue, Aug 13, 10:24 PM
twitu updated the diff for D1720: Modify API output and test.
  • Fix return value for license indexer
Tue, Aug 13, 9:59 PM

Mon, Aug 12

twitu updated the diff for D1720: Modify API output and test.
  • Override test methods
Mon, Aug 12, 5:28 PM
twitu updated the diff for D1720: Modify API output and test.
  • Override test methods
Mon, Aug 12, 5:22 PM
twitu added a comment to D1720: Modify API output and test.

It appears that two different test classes, TestFossologyLicenseRangeIndexer and TestMimetypeRangeIndexer, inherit from the same class, CommonContentIndexerRangeTest, so both use the same assert_results_ok method. Both test classes have the same structure for expected results, so the shared check works for both. As I found in the previous commit, changing the check to match the new format of TestFossologyLicenseRangeIndexer makes TestMimetypeRangeIndexer fail. How do I resolve this? Should I override the method in both classes?
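
One possible way to handle a shared check, sketched under assumptions: only the class whose output format changes overrides assert_results_ok, and the other keeps the inherited version. The class names, run_indexer, and the data below are hypothetical stand-ins, not the real swh-indexer test classes.

```python
import unittest


class CommonRangeTestMixin:
    """Hypothetical shared mixin; both test classes inherit its test method."""

    def run_indexer(self):
        raise NotImplementedError

    def assert_results_ok(self, results):
        # Default expectation: a flat list of dicts, one per content id.
        self.assertIsInstance(results, list)
        for entry in results:
            self.assertIn("id", entry)

    def test_index(self):
        self.assert_results_ok(self.run_indexer())


class MimetypeRangeTest(CommonRangeTestMixin, unittest.TestCase):
    # Keeps the inherited check because its output format is unchanged.
    def run_indexer(self):
        return [{"id": "01", "mimetype": "text/plain"}]


class LicenseRangeTest(CommonRangeTestMixin, unittest.TestCase):
    def run_indexer(self):
        return {"01": [{"tool": "tool-x", "licenses": ["MIT"]}]}

    def assert_results_ok(self, results):
        # Overridden only here: the new format is a dict keyed by content id.
        self.assertIsInstance(results, dict)
        for entries in results.values():
            for entry in entries:
                self.assertIn("licenses", entry)


def run_suite():
    """Run both test classes and return the unittest result object."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(MimetypeRangeTest))
    suite.addTests(loader.loadTestsFromTestCase(LicenseRangeTest))
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

With this layout, a single override is enough as long as the other class still matches the default expectation.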

Mon, Aug 12, 5:50 AM

Sun, Aug 11

twitu updated the diff for D1720: Modify API output and test.
  • Change tests and expected results
Sun, Aug 11, 4:26 PM

Thu, Aug 8

twitu committed rDSTOCf71f5318a1eb: Add support for skipped content in in-memory storage (authored by twitu).
Add support for skipped content in in-memory storage
Thu, Aug 8, 4:17 PM

Tue, Aug 6

twitu added a comment to D1720: Modify API output and test.

I haven't been able to give enough time to it. I'll complete the diff by this weekend.

Tue, Aug 6, 4:30 AM

Mon, Jul 22

twitu closed T1633: skipped_content_missing is not implemented by the in-memory storage as Resolved.
Mon, Jul 22, 5:12 PM · Easy hack, Storage manager
twitu committed rDSTOf71f5318a1eb: Add support for skipped content in in-memory storage (authored by twitu).
Add support for skipped content in in-memory storage
Mon, Jul 22, 5:11 PM
twitu closed D1693: Get skipped content that are missing data.
Mon, Jul 22, 5:11 PM
twitu updated the diff for D1693: Get skipped content that are missing data.

Squash and change commit message

Mon, Jul 22, 5:08 PM
twitu updated the diff for D1693: Get skipped content that are missing data.
  • Refactoring
Mon, Jul 22, 4:55 PM
twitu added inline comments to D1693: Get skipped content that are missing data.
Mon, Jul 22, 4:55 PM
twitu updated the diff for D1720: Modify API output and test.
  • Change expected result
Mon, Jul 22, 3:53 PM

Sun, Jul 21

twitu updated the diff for D1720: Modify API output and test.
  • Fix test
Sun, Jul 21, 10:44 PM
twitu updated the diff for D1693: Get skipped content that are missing data.

Fixed skipped_content counter bug

Sun, Jul 21, 10:18 PM

Fri, Jul 19

twitu updated the diff for D1693: Get skipped content that are missing data.

Rebase on master

Fri, Jul 19, 5:37 PM
twitu updated the diff for D1693: Get skipped content that are missing data.

Rebase and update

Fri, Jul 19, 5:30 PM
twitu updated the diff for D1693: Get skipped content that are missing data.
  • Change index storage mechanism
Fri, Jul 19, 5:20 PM
twitu added a comment to P447 Run a single test file.

Best solution is tox -- -k test_name

Fri, Jul 19, 5:07 PM

Jul 16 2019

twitu updated the diff for D1693: Get skipped content that are missing data.
  • Add break to prevent multiple yields
Jul 16 2019, 8:22 PM
twitu updated the diff for D1693: Get skipped content that are missing data.
  • Use all hashes in a content
Jul 16 2019, 6:51 PM

Jul 13 2019

twitu updated the diff for D1693: Get skipped content that are missing data.
  • Remove dependency
Jul 13 2019, 9:21 AM
twitu updated the diff for D1693: Get skipped content that are missing data.
  • Modify in_memory content_add to add skipped_content
Jul 13 2019, 9:21 AM
twitu added a comment to D1693: Get skipped content that are missing data.

I did not find any mechanism in db.py that actually stores skipped_content. The function at db.py line 51 just passes, without an implementation.

Jul 13 2019, 9:15 AM
twitu added a comment to D1693: Get skipped content that are missing data.

In db.py line 128, the query does not compare blake2s256 despite content_hash_keys = ['sha1', 'sha1_git', 'sha256', 'blake2s256']. Does this mean that skipped content will never be hashed with blake2s256?

Jul 13 2019, 8:43 AM
twitu added a comment to D1693: Get skipped content that are missing data.

I have a concern about storage.py line 120. The function self.content_missing can throw an exception in case of a hash collision. Shouldn't line 120 be in a try/except block to catch that error and ignore that particular content?
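
A rough sketch of the pattern being asked about: wrap the per-content check in try/except so one collision does not abort the whole batch. HashCollision, content_missing, and filter_missing below are hypothetical and simplified; they are not the swh-storage functions or exception types.

```python
class HashCollision(Exception):
    """Hypothetical exception raised when two contents share a hash."""


def content_missing(content, known):
    """Hypothetical lookup: return True if the content is not yet stored."""
    stored = known.get(content["sha1"])
    if stored is None:
        return True
    if stored != content["sha256"]:
        # Same sha1 but a different sha256: treat it as a collision.
        raise HashCollision(content["sha1"])
    return False


def filter_missing(contents, known):
    """Keep missing contents, skipping any that trigger a collision."""
    missing, collided = [], []
    for content in contents:
        try:
            if content_missing(content, known):
                missing.append(content)
        except HashCollision:
            # Record and skip this content instead of aborting the batch.
            collided.append(content)
    return missing, collided
```

Collecting the collided contents separately, rather than dropping them silently, leaves room to log or report them later.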

Jul 13 2019, 6:25 AM

Jul 11 2019

twitu updated the diff for D1720: Modify API output and test.
  • Modify output and test
  • In reference to T1433
  • In reference to T1433
  • Use db_transaction annotation
Jul 11 2019, 8:24 PM
twitu updated the diff for D1720: Modify API output and test.

Use db_transaction annotation

Jul 11 2019, 8:12 PM

Jul 10 2019

twitu created P463 tox test stack trace for swh-indexer in the S1 Public space.
Jul 10 2019, 6:11 PM
twitu retitled D1720: Modify API output and test from Modify output and test to Modify API output and test.
Jul 10 2019, 5:39 PM
twitu updated the diff for D1720: Modify API output and test.

small edit and rebase

Jul 10 2019, 5:36 PM
Herald added a reviewer for D1720: Modify API output and test: Reviewers.
Jul 10 2019, 5:31 PM

Jul 9 2019

twitu committed rDSTOC2ead4ce360ba: Added comments for all tables and columns (authored by twitu).
Added comments for all tables and columns
Jul 9 2019, 3:07 PM
twitu added a comment to T1433: Refactor output of indexer storage's `get` methods..

I went through all the tests in test_storage.py. It appears that only content_fossology_license_get needs to be refactored. All other storage methods return a dictionary or a list of dictionaries, where each dictionary has multiple keys.

Jul 9 2019, 5:28 AM · Easy hack, Indexer

Jul 7 2019

twitu added a comment to T1433: Refactor output of indexer storage's `get` methods..

I am familiar with the web APIs and I went through the discussion in T782. When you say output a single dictionary, I believe you mean something like this:

{
  sha1: [
    {tool: TOOL, licenses: [licences]},
    {tool: TOOL, licenses: [licences]}
  ],
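
As an illustration of the shape sketched above, flat rows could be grouped into a single dictionary keyed by sha1. This is only a sketch: the row fields ("id", "tool", "licenses") and the function name are assumptions made for the example, not the actual indexer storage output.

```python
from collections import defaultdict


def group_licenses_by_sha1(rows):
    """Group flat rows into {sha1: [{'tool': ..., 'licenses': [...]}, ...]}."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["id"]].append(
            {"tool": row["tool"], "licenses": row["licenses"]}
        )
    # Convert back to a plain dict so missing keys raise KeyError as usual.
    return dict(grouped)


rows = [
    {"id": "sha1-a", "tool": "tool-1", "licenses": ["MIT"]},
    {"id": "sha1-a", "tool": "tool-2", "licenses": ["MIT", "GPL-3.0"]},
    {"id": "sha1-b", "tool": "tool-1", "licenses": ["Apache-2.0"]},
]
grouped = group_licenses_by_sha1(rows)
```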
Jul 7 2019, 4:23 PM · Easy hack, Indexer

Jul 6 2019

twitu added a comment to D1693: Get skipped content that are missing data.

This logic is similar to the one used in db.skipped_content_missing. However, the current implementation of in_memory._content_add does not populate _skipped_contents and _skipped_content_indexes.

Jul 6 2019, 9:30 AM
Herald added a reviewer for D1693: Get skipped content that are missing data: Reviewers.
Jul 6 2019, 9:25 AM

Jul 2 2019

twitu added a comment to T1758: consistently document the configuration option of each module.

After re-reading the documentation, I realized that the configuration files given in swh-indexer are actually used by swh-scheduler and swh-storage, indicating that these two modules are at the root of the dependency tree.

Jul 2 2019, 7:36 PM · Development documentation
twitu added a comment to T1758: consistently document the configuration option of each module.

I looked at how the docs look after being built; for example, take swh-indexer at https://docs.softwareheritage.org/devel/swh-indexer/dev-info.html. It seems like the configuration information, along with instructions to run and test it, is best suited to this page. Have you considered adding comments about configuration parameters to this page itself, rather than making a top-level file? Only someone hacking on swh-indexer would be interested in the configuration.

Jul 2 2019, 4:45 PM · Development documentation
twitu added a comment to T1758: consistently document the configuration option of each module.

I can begin working on it, once I understand what is required. Is my interpretation of the task correct?

Jul 2 2019, 12:55 PM · Development documentation
twitu closed T1864: Inkscape is not mentioned as dependency for building swh-docs as Resolved.
Jul 2 2019, 4:51 AM · Development documentation
twitu committed rDDOC20790ca8a5b5: Add inkscape to required tools in README (authored by twitu).
Add inkscape to required tools in README
Jul 2 2019, 4:50 AM
twitu closed D1673: Add inkscape to required tools in README.
Jul 2 2019, 4:50 AM

Jul 1 2019

Herald added a reviewer for D1673: Add inkscape to required tools in README: Reviewers.
Jul 1 2019, 6:39 PM

Jun 29 2019

twitu added a comment to T1758: consistently document the configuration option of each module.

This is related to T1388.

Jun 29 2019, 2:41 PM · Development documentation
twitu created T1864: Inkscape is not mentioned as dependency for building swh-docs in the S1 Public space.
Jun 29 2019, 2:36 PM · Development documentation

Jun 28 2019

twitu added a reverting change for rDDEP216d0f74d8c3: Reformat docstrings for max line length: rDDEP4c4324788ae0: Revert "Reformat docstrings for max line length".
Jun 28 2019, 3:04 PM
twitu committed rDDEP4c4324788ae0: Revert "Reformat docstrings for max line length" (authored by twitu).
Revert "Reformat docstrings for max line length"
Jun 28 2019, 3:04 PM
twitu closed D1658: Revert "Reformat docstrings for max line length".
Jun 28 2019, 3:04 PM
twitu updated the diff for D1658: Revert "Reformat docstrings for max line length".

Rebase from master

Jun 28 2019, 3:03 PM
twitu updated subscribers of D1658: Revert "Reformat docstrings for max line length".

To expand on this further, I think the pre-push hook scripts you have configured only work when using git in a terminal. I am using VS Code and, curiously, I used the GUI to push changes. I believe the scripts are not able to check such a situation, so I can commit changes without review. This is probably a security flaw that should be considered seriously. @zack

Jun 28 2019, 5:47 AM
Herald added a reviewer for D1658: Revert "Reformat docstrings for max line length": Reviewers.
Jun 28 2019, 5:35 AM
twitu added a reverting change for rDDEP216d0f74d8c3: Reformat docstrings for max line length: D1658: Revert "Reformat docstrings for max line length".
Jun 28 2019, 5:35 AM
twitu closed T1836: Reformat docstrings that exceed 80 columns as Resolved.
Jun 28 2019, 5:25 AM · Development documentation, Easy hack
twitu committed rDWAPPSb2555fabb0fc: Reformatted docstrings wherever possible (authored by twitu).
Reformatted docstrings wherever possible
Jun 28 2019, 5:24 AM
twitu closed D1650: Reformat docstring in utils.py.
Jun 28 2019, 5:24 AM
twitu committed rDDEP216d0f74d8c3: Reformat docstrings for max line length (authored by twitu).
Reformat docstrings for max line length
Jun 28 2019, 5:23 AM
twitu added a comment to D1650: Reformat docstring in utils.py.

It seems like my revision is already in origin master. There are no changes to push. Should I close this revision?

Jun 28 2019, 4:54 AM

Jun 27 2019

twitu updated the diff for D1650: Reformat docstring in utils.py.

Rebased master

Jun 27 2019, 8:16 PM
twitu added a comment to D1650: Reformat docstring in utils.py.

Aren't all docstrings rendered somewhere, along with the documentation for packages, submodules, and the functions they contain?

Jun 27 2019, 7:59 PM
twitu added a comment to D1650: Reformat docstring in utils.py.

While formatting the documents I realized that whoever was writing was trying to keep line lengths short, but they followed an arbitrary line length that was longer than 80 characters. Since # noqa was applied, these formatting errors did not pop up. I think a lot of this can be resolved if there is a guideline to add a vertical ruler to the editor at 80 chars.

Jun 27 2019, 7:51 PM
twitu updated the diff for D1650: Reformat docstring in utils.py.

Fix more docstrings

Jun 27 2019, 7:50 PM
twitu updated the task description for T1836: Reformat docstrings that exceed 80 columns.
Jun 27 2019, 7:40 PM · Development documentation, Easy hack
twitu committed rDLDSVN3667f7165353: Remove unnecessary noqa (authored by twitu).
Remove unnecessary noqa
Jun 27 2019, 7:24 PM
twitu closed D1655: Remove unnecessary noqa.
Jun 27 2019, 7:24 PM
twitu updated the diff for D1650: Reformat docstring in utils.py.

Reformat docstring wherever possible

Jun 27 2019, 7:24 PM
twitu committed rDMODdde39f51c0fa: Reformat docstring for max line length (authored by twitu).
Reformat docstring for max line length
Jun 27 2019, 6:48 PM
twitu closed D1649: Reformat docstring for max line length.
Jun 27 2019, 6:48 PM
Herald added a reviewer for D1655: Remove unnecessary noqa: Reviewers.
Jun 27 2019, 6:48 PM
twitu updated the diff for D1649: Reformat docstring for max line length.

Reformat docstring for max line length

Jun 27 2019, 6:44 PM
twitu added inline comments to D1649: Reformat docstring for max line length.
Jun 27 2019, 4:13 PM
twitu updated the task description for T1836: Reformat docstrings that exceed 80 columns.
Jun 27 2019, 4:11 PM · Development documentation, Easy hack
twitu abandoned D1648: Reformat docstring for max line length.
Jun 27 2019, 4:11 PM
twitu added a comment to D1648: Reformat docstring for max line length.

I'll close this revision.

Jun 27 2019, 4:11 PM
twitu added a comment to D1648: Reformat docstring for max line length.

I get it now, that's an ingenious way of keeping documentation up to date. However, if this is evaluated, why does adding whitespace change the output? The dictionary and list should still have valid items.

Jun 27 2019, 3:55 PM
twitu added a comment to D1648: Reformat docstring for max line length.

Then swh-indexer does not require any changes, so I will close this diff. Wow, I never knew docstrings could be evaluated. Why is this required, though?

Jun 27 2019, 3:48 PM
twitu abandoned D1647: Reformat docstrings for max line length.
Jun 27 2019, 3:46 PM
twitu added a comment to D1647: Reformat docstrings for max line length.

I will close this diff since no changes are required for swh-deposit. With complex cases like this, there is no way this process can be automated.

Jun 27 2019, 3:45 PM
twitu added a comment to T1836: Reformat docstrings that exceed 80 columns.

I followed the swh-docker-dev documentation to host the setup locally, but it only hosts the web portal; I couldn't access locally hosted documentation. How can I see the effect of my changes?

Jun 27 2019, 6:51 AM · Development documentation, Easy hack
Herald added a reviewer for D1650: Reformat docstring in utils.py: Reviewers.
Jun 27 2019, 6:48 AM
twitu updated the summary of D1649: Reformat docstring for max line length.
Jun 27 2019, 6:30 AM
twitu updated the summary of D1648: Reformat docstring for max line length.
Jun 27 2019, 6:30 AM
Herald added a reviewer for D1649: Reformat docstring for max line length: Reviewers.
Jun 27 2019, 6:23 AM
twitu added a comment to D1647: Reformat docstrings for max line length.

It is odd that py3 test cases are failing although I have only changed the formatting of the docstrings.

Jun 27 2019, 6:23 AM
Herald added a reviewer for D1648: Reformat docstring for max line length: Reviewers.
Jun 27 2019, 6:19 AM
twitu updated the summary of D1647: Reformat docstrings for max line length.
Jun 27 2019, 6:08 AM
Herald added a reviewer for D1647: Reformat docstrings for max line length: Reviewers.
Jun 27 2019, 6:08 AM

Jun 26 2019

twitu updated the task description for T1836: Reformat docstrings that exceed 80 columns.
Jun 26 2019, 7:11 PM · Development documentation, Easy hack
twitu added a comment to T1836: Reformat docstrings that exceed 80 columns.

What is the expected formatting for snippets like these?

the 80 character mark is -----------------------------------------------------------------|
@browse_route(r'origin/(?P<origin_type>[a-z]+)/url/(?P<origin_url>.+)/visit/(?P<timestamp>.+)/directory/', # noqa
              r'origin/(?P<origin_type>[a-z]+)/url/(?P<origin_url>.+)/visit/(?P<timestamp>.+)/directory/(?P<path>.+)/', # noqa
              r'origin/(?P<origin_type>[a-z]+)/url/(?P<origin_url>.+)/directory/', # noqa
              r'origin/(?P<origin_type>[a-z]+)/url/(?P<origin_url>.+)/directory/(?P<path>.+)/', # noqa
              r'origin/(?P<origin_url>.+)/visit/(?P<timestamp>.+)/directory/', # noqa
              r'origin/(?P<origin_url>.+)/visit/(?P<timestamp>.+)/directory/(?P<path>.+)/', # noqa
              r'origin/(?P<origin_url>.+)/directory/', # noqa
              r'origin/(?P<origin_url>.+)/directory/(?P<path>.+)/', # noqa
              view_name='browse-origin-directory')
def origin_directory_browse(request, origin_url, origin_type=None,
                            timestamp=None, path=None):
    """Django view for browsing the content of a directory associated
    to an origin for a given visit.
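
One common Python option for long patterns like these, offered only as a sketch and not as the project's agreed convention: adjacent raw string literals inside parentheses are concatenated by the parser, so a route can be split at path boundaries and stay under 80 columns without # noqa. LONG_ROUTE and the sample URL are made up for the example.

```python
import re

# Adjacent string literals inside parentheses are joined into one string,
# so each fragment can sit on its own short line.
LONG_ROUTE = (
    r'origin/(?P<origin_type>[a-z]+)'
    r'/url/(?P<origin_url>.+)'
    r'/visit/(?P<timestamp>.+)'
    r'/directory/(?P<path>.+)/'
)

# Hypothetical URL path, split the same way to keep the line short.
sample = (
    'origin/git/url/https://example.org/repo'
    '/visit/2019-06-26/directory/src/'
)
match = re.match(LONG_ROUTE, sample)
```

The resulting pattern is identical to the single-line version, so the routing behaviour should not change.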
Jun 26 2019, 6:51 PM · Development documentation, Easy hack
twitu added a comment to T1836: Reformat docstrings that exceed 80 columns.
"""Django view that produces an HTML display of a content identified
    by its hash value.
Jun 26 2019, 6:37 PM · Development documentation, Easy hack

Jun 25 2019

twitu added a comment to T1839: Write glossary/taxonomy for push archival process and mechanism.

https://docs.softwareheritage.org/devel/apidoc/swh.model.html#module-swh.model.identifiers
This explains some of the fields associated with dates, timestamps and offset.

Jun 25 2019, 6:31 PM · Scientific Community Building, SWORD deposit
twitu closed T1527: Have comments on all columns of all databases as Resolved.
Jun 25 2019, 6:25 PM · Easy hack, Development documentation, Storage manager, Scheduling utilities, Indexer
twitu added a comment to T1613: Add a public API endpoint to get the metadata of an origin.

This task is completed; it can be closed.

Jun 25 2019, 6:24 PM · Easy hack, Metadata workflow, Web app
twitu committed rDWAPPS9a8e1f98c9fd: Add origin_metadata_get API endpoint (authored by twitu).
Add origin_metadata_get API endpoint
Jun 25 2019, 6:10 PM
twitu closed D1623: Add origin_metadata_get API endpoint.
Jun 25 2019, 6:10 PM