- Polish the code
Jul 15 2021
- Install tree-sitter-cli (NodeJS) during builds
- Generate parser before building swh_ql.so
- Fix installation/build errors
- Generate swh_ql.so at builds
- Fix failing build (because of data_files)
Jul 14 2021
- Move parser to search_language dir
- Introduce Makefile.local and add TreeSitter related commands
- Set data_files of setup.py to 'generated/search_ql.so'
Jul 13 2021
- Add newline at the end of files
- Add test for sort_by : ["date_created"]
- Deduplicate calculation of some variables in _get_sorting_key
- Use iso8601 library to validate date format in intrinsic_metadata fields
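The date validation mentioned above could look roughly like the sketch below. It is only an illustration: it uses the standard library's `datetime.fromisoformat` as a stand-in for the iso8601 library, and the names `DATE_FIELDS`, `is_valid_date` and `drop_unparsable_dates` are hypothetical, not taken from the actual code.

```python
from datetime import datetime

# Hypothetical field names, assumed for illustration only.
DATE_FIELDS = ("dateCreated", "dateModified", "datePublished")


def is_valid_date(value):
    """Return True if value is a string that parses as an ISO 8601 date."""
    if not isinstance(value, str):
        return False
    try:
        # Stand-in for the iso8601 library's parser.
        datetime.fromisoformat(value)
    except ValueError:
        return False
    return True


def drop_unparsable_dates(metadata):
    """Return a copy of metadata without date fields that fail to parse."""
    return {
        key: value
        for key, value in metadata.items()
        if key not in DATE_FIELDS or is_valid_date(value)
    }
```

Non-date fields pass through untouched; only the listed date fields are checked.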
Jul 12 2021
string_content and escape_sequence have been adapted from the JSON Treesitter grammar
Jul 9 2021
For input
url : "github.com/django/Django" metadata : something qewq with_visit : true with_visit : false nb_visits >= 0 nb_visits = 10 nb_visits != 256 nb_visits < 1000 sort_by : ["nb_visits", "last_revision_date", last_release_date] last_release_date < 2001-02-13 15:54:21 licenses in ["MIT","BSD X","Apache"]
- I highly recommend generating/visualising the corresponding railroad diagram with https://www.bottlecaps.de/rr/ui
- Check out P1091 for the Treesitter implementation and some example queries.
Jul 7 2021
- origin_update: Document rejection of metadata date fields if not parsable
Jul 6 2021
- elasticsearch.py: Use "lenient: true"
- origin_search: Validate intrinsic_metadata date field format before storing
- test_search: Fix failing tests
Jul 5 2021
In D5964#153292, @vlorentz wrote: Can you either add tests, or deduplicate this code so we don't need to test every field?
- Move get_expansion to utils.py
- Add tests for filters as well as sorting options
- Polish existing code
Add commit body
- Squash
- Minor polishes
Jul 2 2021
- origin_search: Polish the code with get_expansion and other methods
- Fix failing doctest
- Add commit description
- Improve doctest for _nested_get
- Squash
Jul 1 2021
- in_memory: Allow list of licenses and programmingLanguages
- test_in_memory: Add test for _nested_get
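For readers unfamiliar with the helper, a nested lookup of this kind might be sketched as follows. This is an assumption about its general shape, not the actual `_nested_get` implementation: the name `nested_get`, its signature and its list-flattening behaviour are all hypothetical.

```python
def nested_get(data, keys, default=None):
    """Collect every value reachable from data by following keys.

    Lists met along the way are traversed element by element, so a single
    path can yield several values (e.g. a list of licenses).
    """
    if isinstance(data, list):
        results = []
        for item in data:
            results.extend(nested_get(item, keys, default))
        return results
    if not keys:
        # End of the path: data is the value we were looking for.
        return [data]
    if isinstance(data, dict) and keys[0] in data:
        return nested_get(data[keys[0]], keys[1:], default)
    # The path does not exist in this branch.
    return [default]
```

Returning a flat list keeps the caller's filtering code the same whether a field holds one value or many.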
Jun 30 2021
- in_memory: Use expanded intrinsic_metadata
- test_search: Test for search on multiple intrinsic_metadata fields
- Use analyzer on the list of licenses and languages
- origin_search: Allow search for multiple licenses and languages at once
- test_search: Separate tests for programming_language and license
- origin_search: Expose language and license from intrinsic_metadata
- test_search: Add test for language and license
If the only issue is the slowdown, can we keep them nested for now, and benchmark later to see if un-nesting is worth it?
@vlorentz, just in case you missed it, the values don't get duplicated. I'm popping them out of intrinsic_metadata.
I was trying to avoid nested documents as I've read in many places that they slow down searches (not sure by how much).
Jun 29 2021
Jun 28 2021
Rebase
Jun 26 2021
- origin_search: Polish code related to sort_by
Jun 25 2021
- origin_search: Allow sorting with multiple fields
- interface: Maintain SORT_BY_OPTIONS list
- test_search: Improve tests for sort_by
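Sorting by multiple fields can be sketched for an in-memory list of documents as below. The function name `sort_origins` and the convention that a leading "-" means descending order are assumptions made for this illustration, and every document is assumed to contain each sort field.

```python
def sort_origins(origins, sort_by):
    """Sort origin documents by several fields, most significant first."""
    result = list(origins)
    # Python's sort is stable, so sorting by the least significant field
    # first and the most significant field last gives a multi-field order.
    for field in reversed(sort_by):
        descending = field.startswith("-")
        name = field[1:] if descending else field
        result.sort(key=lambda doc: doc[name], reverse=descending)
    return result
```

For example, `sort_origins(docs, ["-nb_visits", "last_release_date"])` orders by visit count from highest to lowest and breaks ties by the earliest release date.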
Jun 23 2021
- Improve commit body
Limit each line in commit message to 80 chars
Update commit message body
Can you please also take care of updating it at the same time?
- rename the type parameter to date_type
- add .vscode to .gitignore
Is it okay if I add .vscode to .gitignore? It often gets included by mistake. swh-indexer, swh-storage and swh-web already have it in their .gitignore files.
- Improve code quality
- Improve code quality (as suggested by @anlambert)
Jun 21 2021
- Changes suggested by vlorentz
Jun 18 2021
- Throw error in absence of storage config
- Add test for fetch_last_revision_release_date in journal_client
- Add missing arguments and tests related to last_release_date
- Include last revision/release date only when available
- merge the fetch_last_*_date methods
- Use snapshot_get_all_branches instead of snapshot_get_branches
- Fix diff/commit description/message
- Use in-memory backend in test_cli.py