
faux (mihir karbelkar)
User

Projects

User does not belong to any projects.

User Details

User Since
Mar 21 2019, 4:00 PM (12 w, 4 d)

Recent Activity

Fri, May 24

faux added a comment to T1446: Add support for slices in Storage.content_get.

Still open?

Fri, May 24, 1:28 PM · Storage manager

Tue, May 21

faux closed T1349: Storage.content_find should return all matches, not just one. as Resolved.
Tue, May 21, 9:53 AM · Easy hack, Storage manager

May 16 2019

faux added a project to T1721: Implementation of Gogs Lister: Archive coverage.
May 16 2019, 2:50 PM · Archive coverage
faux triaged T1721: Implementation of Gogs Lister as Low priority.
May 16 2019, 2:46 PM · Archive coverage
faux added a comment to D1420: Made changes to adapt it to new content_find return type.

hehehehe.

May 16 2019, 2:21 PM
faux committed rDWAPPSed61da4501cf: Made changes to adapt it to new content_find return type and added the test for… (authored by faux).
May 16 2019, 2:20 PM
faux closed D1420: Made changes to adapt it to new content_find return type.
May 16 2019, 2:20 PM
faux added a comment to D1288: Storage.content_find returns list instead of single value.

Dayummmm! I did it. Thanks @vlorentz

May 16 2019, 1:54 PM
faux committed rDSTO02134a705a12: Changes the output of content_find method to a list in case of hash collisions… (authored by faux).
May 16 2019, 1:53 PM
faux closed D1288: Storage.content_find returns list instead of single value.
May 16 2019, 1:53 PM
faux updated the diff for D1420: Made changes to adapt it to new content_find return type.

hmmmm

May 16 2019, 1:38 PM
faux updated the diff for D1420: Made changes to adapt it to new content_find return type.

Rebased onto master and made changes.

May 16 2019, 12:42 PM
faux updated the diff for D1288: Storage.content_find returns list instead of single value.

Hopefully that does it.

May 16 2019, 11:10 AM

May 15 2019

faux updated the diff for D1288: Storage.content_find returns list instead of single value.

Used the for loop only once; removed new_checksum_dict as it was not needed.

May 15 2019, 11:04 PM
faux updated the diff for D1288: Storage.content_find returns list instead of single value.

First build failed; uploading again.

May 15 2019, 10:38 PM
faux updated the diff for D1420: Made changes to adapt it to new content_find return type.

Have made the changes.

May 15 2019, 9:19 PM
faux updated the diff for D1420: Made changes to adapt it to new content_find return type.

Have used unknown_revision, unknown_directory and unknown_content

May 15 2019, 7:32 PM
faux added a comment to T1709: implement an R-cran lister.

@nahimilega it is probably a two-line script: install R and call readRDS(), and you will get a data.frame object, which is like a table with columns, so you can then extract what you want. Cheers :). BTW, when I ran readRDS it retrieved a lot of links. I don't know much about the lister, but you can pick up from there.

May 15 2019, 2:06 PM · GSoC 2019, Archive coverage

May 14 2019

faux added a comment to D1420: Made changes to adapt it to new content_find return type.

we have to give dir_path some value
so I have also tried doing revision_['target'] = unknown_content['sha1_git'] but the notfoundexc is never raised

Yes, look how test_lookup_directory_with_revision_with_path does it

Er, you already did that, and it indeed raises an exception, look at Jenkins' logs:

You can reproduce this example by temporarily adding @reproduce_failure('4.23.4', b'AXicY2TAD5iwiDESqQ5FDwAB3AAI') as a decorator on your test case
Falsifying example: test_lookup_directory_with_revision_unknown_content(self=<swh.web.tests.common.test_service.ServiceTestCase testMethod=test_lookup_directory_with_revision_unknown_content>, revision='500a697730a27eabdcdf8af4aa6137819e72f171', unknown_content={'blake2S256': '0000000000000000000000000000000000000000000000000000000000000002',
 'sha1': '0000000000000000000000000000000000000001',
 'sha1_git': '0000000000000000000000000000000000000002',
 'sha256': '0000000000000000000000000000000000000000000000000000000000000001'})
Traceback (most recent call last):
  File "/home/jenkins/workspace/DWAPPS/tox/.tox/py3/lib/python3.5/site-packages/swh/web/tests/common/test_service.py", line 359, in test_lookup_directory_with_revision_unknown_content
    cm.exception.args[0])
  File "/usr/lib/python3.5/unittest/case.py", line 1080, in assertIn
    self.fail(self._formatMessage(msg, standardMsg))
  File "/usr/lib/python3.5/unittest/case.py", line 666, in fail
    raise self.failureException(msg)
AssertionError: 'Content not found for 500a697730a27eabdcdf8af4aa6137819e72f171' not found in "Directory or File 'README.md' pointed to by revision 500a697730a27eabdcdf8af4aa6137819e72f171 not found"

The only issue here is that your assertion does not match the error message
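
As a minimal sketch of the fix suggested above, the test can assert on a fragment of the message that is actually raised. The `NotFoundExc` class and `lookup_directory_with_revision` function below are hypothetical stand-ins written for this illustration, not the real swh.web code; only the error text is taken from the Jenkins log.

```python
import unittest


class NotFoundExc(Exception):
    """Hypothetical stand-in for the web app's not-found exception."""


def lookup_directory_with_revision(revision, path):
    # Hypothetical stand-in: always fails with the message seen in the log.
    raise NotFoundExc(
        "Directory or File '%s' pointed to by revision %s not found"
        % (path, revision)
    )


class ServiceTestCase(unittest.TestCase):
    def test_lookup_directory_with_revision_unknown_content(self):
        revision = "500a697730a27eabdcdf8af4aa6137819e72f171"
        with self.assertRaises(NotFoundExc) as cm:
            lookup_directory_with_revision(revision, "README.md")
        # Match the message that is really raised, rather than
        # 'Content not found for <revision>'.
        self.assertIn(
            "pointed to by revision %s not found" % revision,
            cm.exception.args[0],
        )
```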

May 14 2019, 4:25 PM
faux updated the diff for D1420: Made changes to adapt it to new content_find return type.

Now everything works. Thanks @vlorentz.

May 14 2019, 4:24 PM

May 10 2019

faux updated the diff for D1420: Made changes to adapt it to new content_find return type.

To get NotFoundExc raised through content_find, dir_path has to be given some value; otherwise entity['type'] is set to 'dir' by default and the elif condition is never met.
I also tried setting revision_['target'] = unknown_content['sha1_git'], but NotFoundExc is still never raised.
What should I do?

May 10 2019, 9:36 PM

Apr 29 2019

faux updated the diff for D1420: Made changes to adapt it to new content_find return type.

I have added the test so you can check whether I am going in the right direction.
I took these values from the terminal output of print(revision, unknown_content).
I have also changed my implementation of "return the first element of the list, otherwise None" to use the _first_element function,
to make it more concise and easier to understand.

Apr 29 2019, 6:29 AM

Apr 28 2019

faux updated the diff for D1420: Made changes to adapt it to new content_find return type.

I am getting an error when adding the test, so I thought it would be better to get a review.
The error also sometimes occurs in files unrelated to this change. For this change, the output says that inconsistent data is being generated,
which I am unable to diagnose.

Apr 28 2019, 12:46 AM

Apr 16 2019

faux updated the diff for D1420: Made changes to adapt it to new content_find return type.

Made the required changes

Apr 16 2019, 4:37 PM
Herald added a reviewer for D1420: Made changes to adapt it to new content_find return type: Reviewers.
Apr 16 2019, 2:27 PM
faux added a revision to T1349: Storage.content_find should return all matches, not just one.: D1420: Made changes to adapt it to new content_find return type.
Apr 16 2019, 2:27 PM · Easy hack, Storage manager
faux updated the diff for D1288: Storage.content_find returns list instead of single value.

Missed a function in in_memory

Apr 16 2019, 2:19 PM

Apr 14 2019

faux added a comment to T808: phabricator lister.

Sure go ahead ;)

Apr 14 2019, 2:13 PM · Easy hack, Phabricator forge
faux added a comment to T808: phabricator lister.

Hey, I was wondering if @nahimilega is still working on this? If not, can I look into it? It would be good practice for me before implementing the Launchpad lister and the Gogs lister. Thanks :)

Apr 14 2019, 7:01 AM · Easy hack, Phabricator forge

Apr 13 2019

faux updated the diff for D1288: Storage.content_find returns list instead of single value.

Made the required changes

Apr 13 2019, 11:22 AM
faux added a comment to D1288: Storage.content_find returns list instead of single value.

Sorry pushed the wrong thing...

Apr 13 2019, 11:18 AM
faux updated the diff for D1288: Storage.content_find returns list instead of single value.

Merged the for loop, removed the assertEqual on length, and removed the if from content_find.

Apr 13 2019, 11:14 AM
faux updated the diff for D1288: Storage.content_find returns list instead of single value.

Added the tests for colliding sha256 and blake2s256 hashes

Apr 13 2019, 9:25 AM

Apr 12 2019

faux updated the diff for D1288: Storage.content_find returns list instead of single value.

Added more tests for content_find.

Apr 12 2019, 4:37 PM
faux updated the diff for D1288: Storage.content_find returns list instead of single value.

Removed the if(s), for loop, LIMIT ALL. Made the query a bit more readable.

Apr 12 2019, 5:47 AM

Apr 11 2019

faux added a comment to D1288: Storage.content_find returns list instead of single value.

Almost there. Just stuck on the query part. I do think it is similar to https://forge.softwareheritage.org/D1345#inline-8002.

Apr 11 2019, 3:38 PM

Apr 10 2019

faux updated the diff for D1288: Storage.content_find returns list instead of single value.

Have made the requested changes.
In db.py: changed the content_find method to build the SQL query on the Python side.
In test_storage.py: added a test for duplicate content.
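
One way to build such a query on the Python side is sketched below. The table name `content`, the column list, and the function name `build_content_find_query` are assumptions for illustration; this is not the code from db.py. The `%s` placeholders follow the DB-API "format" parameter style, so the values are passed separately rather than interpolated into the SQL.

```python
# Assumed column names; used as a whitelist so no caller-supplied
# string ever ends up in the SQL text.
HASH_COLUMNS = ("sha1", "sha1_git", "sha256", "blake2s256")


def build_content_find_query(hashes):
    """Return (query, params) matching rows on every hash that was given."""
    keys = [k for k in HASH_COLUMNS if hashes.get(k) is not None]
    if not keys:
        raise ValueError("at least one hash is required")
    # One 'column = %s' condition per provided hash, joined with AND.
    where = " AND ".join("%s = %%s" % k for k in keys)
    query = "SELECT %s FROM content WHERE %s" % (", ".join(HASH_COLUMNS), where)
    return query, [hashes[k] for k in keys]


query, params = build_content_find_query({"sha256": "aa", "sha1": "bb"})
```

With no LIMIT clause, executing the query and fetching all rows returns every match, which is what a list-returning content_find needs.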

Apr 10 2019, 11:35 PM

Apr 2 2019

faux added a comment to D1288: Storage.content_find returns list instead of single value.

fwiw, from irc discussion:

20:07 <faux__> pinkieval: I have made all the required changes but I am still confused about the test as in what should we do when the content is duplicated? Sorry to be a bit late as I was travelling so rarely had internet connectivity
20:15 <+pinkieval> faux__: The content is not duplicated in the existing tests. You must add a new test where the content is duplicated, to see how content_find behaves
20:17 <faux__> By using content_add right? I did that but apparently content_find only finds one result and not two of the same result in the list
20:31 <+pinkieval> the goal of your change is to make content_find find more than one
10:17 <+ardumont> because content_add filters on existing contents so if you inject the same content twice, you will have only 1 content in the db

I found the same thing when adding duplicate content to the database using content_add. So if it automatically filters duplicate data, then content_find should return only one result, as a list, right?
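
The behaviour discussed above can be illustrated with a toy in-memory store. This is only a sketch under stated assumptions: `ToyStorage` is hypothetical, keys contents by sha1, and is not the swh.storage implementation. Adding the same content twice leaves one entry, while two contents that share a sha256 (a simulated collision) are both returned by `content_find`.

```python
class ToyStorage:
    """Hypothetical in-memory store keyed by sha1, for illustration only."""

    def __init__(self):
        self._contents = {}

    def content_add(self, contents):
        # Contents already present (same sha1) are skipped, mirroring the
        # filtering described in the IRC discussion.
        for content in contents:
            self._contents.setdefault(content["sha1"], content)

    def content_find(self, hashes):
        # Return every stored content that matches all the given hashes.
        return [
            c for c in self._contents.values()
            if all(c.get(algo) == value for algo, value in hashes.items())
        ]


storage = ToyStorage()
original = {"sha1": "01", "sha256": "aa"}
storage.content_add([original, dict(original)])   # same content twice
duplicates = storage.content_find({"sha1": "01"})

storage.content_add([{"sha1": "02", "sha256": "aa"}])  # simulated collision
collisions = storage.content_find({"sha256": "aa"})
```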

Apr 2 2019, 5:30 AM

Mar 27 2019

faux added a comment to D1288: Storage.content_find returns list instead of single value.

Thanks, will get back to you with required changes.

Mar 27 2019, 4:53 PM

Mar 24 2019

faux updated the summary of D1288: Storage.content_find returns list instead of single value.
Mar 24 2019, 6:12 PM
faux added a revision to T1349: Storage.content_find should return all matches, not just one.: D1288: Storage.content_find returns list instead of single value.
Mar 24 2019, 6:10 PM · Easy hack, Storage manager
Herald added 1 required legal document(s) to D1288: Storage.content_find returns list instead of single value: L3 Software Heritage Contributor License Agreement, version 1.0.
Mar 24 2019, 6:10 PM