User Details
- User Since: Feb 27 2018, 4:45 PM (360 w, 5 d)
Jun 22 2018
Adjustments of the report.
Adjustments of the report.
Adjustments of the report.
Jun 21 2018
More ameliorations of the report.
yuanyin committed R131:c799e8e01fa2: Report rectified. Detailed results available alongside the report (authored by yuanyin).
Jun 18 2018
yuanyin committed R131:432826e6870c: classify(path) exposed as top level interface. (authored by yuanyin).
Jun 17 2018
Minor changes. Software Heritage cited.
yuanyin committed R131:8dd763658ff5: Report completed with new results on new database with 374 languages. (authored by yuanyin).
Jun 16 2018
yuanyin committed R131:4ef85d97d9ea: Report completed. Old results took off, waiting for new results. (authored by yuanyin).
Jun 15 2018
yuanyin committed R131:77a17e99fac3: Section 7 almost completed. New language lists. Checker bugs fixed. (authored by yuanyin).
Jun 12 2018
Section 6 & 7 almost finished.
Jun 11 2018
Section 6 started. More figures added.
Jun 10 2018
yuanyin committed R131:c993ae6201cc: Add classification of file to class CNN. (authored by yuanyin).
yuanyin committed R131:f967ed67a708: A manual checking tool for Software Heritage. (authored by yuanyin).
Jun 9 2018
yuanyin committed R131:42805303c440: Section 5 adjusted and enriched. Figures added. (authored by yuanyin).
yuanyin committed R131:dd4208aa929c: Section 4 almost completed, Section 5 adjusted. (authored by yuanyin).
Jun 7 2018
yuanyin committed R131:02bb5448ef25: Metric adjustment and new sections. (Incomplete sections marked with "Ongoing".) (authored by yuanyin).
Jun 5 2018
yuanyin committed R131:07c232881a2d: Model validity metric migrated to F1 from recall. More sections of the report. (authored by yuanyin).
Jun 4 2018
Half of Section 4, start of Section 5
Jun 3 2018
Progress on the report.
May 29 2018
First commit of report.
May 27 2018
yuanyin committed R131:bb237d755db1: New experiments for a smaller language set. (authored by yuanyin).
May 2 2018
Minor fixes.
May 1 2018
yuanyin committed R131:45f373b6a313: New tokenizer for word-level classification. (authored by yuanyin).
Apr 30 2018
yuanyin committed R131:b47bcdfb00e6: A little bit of unsupervised clustering. (authored by yuanyin).
Apr 29 2018
yuanyin committed R131:8a224a8a1d42: N-grams frequency method updated to use .csv dataset. (authored by yuanyin).
Apr 24 2018
yuanyin committed R131:0057e4eef222: cnn.py: training on Spark; new graphs of results. (authored by yuanyin).
yuanyin committed R131:fb1dd5acaed3: New results of word-level ConvNet method. (authored by yuanyin).
Apr 23 2018
yuanyin committed R131:6e9fbb3856db: cnn_w.py: strings and numbers replaced by special token. (authored by yuanyin).
Minor fixes.
cnn.py: import updated.
yuanyin committed R131:6a0d1065d248: Word-level ConvNet added. Early stopping added to character-level ConvNet. (authored by yuanyin).
Apr 22 2018
Local path issue fixed.
Apr 18 2018
Requirements updated.
import update.
Apr 12 2018
CNN: minor fixes.
CNN: minor fixes.
CNN: new interface of command line.
Apr 11 2018
CNN: minor fixes
Complete CNN trainer and tester.
cnn.py: new training function
Apr 10 2018
Guesslang approach.
Apr 3 2018
yuanyin committed R131:6981189aa7fc: Two new approaches: Multinomial Naive Bayes and ConvNet (authored by yuanyin).
Mar 27 2018
yuanyin committed R131:5f70120fa1fc: Make tests interruptible between classes. (authored by yuanyin).
Mar 26 2018
yuanyin committed R131:fa645f9f780d: Add n-grams approach with probabilities. (authored by yuanyin).
Mar 24 2018
Benchmark of average process time.
Mar 23 2018
Visualisation of test results.
Mar 21 2018
Improvement for faster tests.
yuanyin committed R131:6fbf8a92a5a1: Initial version of baseline method: letters n-grams with frequency distance. (authored by yuanyin).
Mar 19 2018
yuanyin committed R131:2659bcd1c3c7: Add a simple test code. Minor bugs in utils fixed. (authored by yuanyin).
Mar 18 2018
Add n-gram model training.
Mar 17 2018
Minor fixes.
yuanyin committed R131:6b163a269ff6: Lister fixed, clone into 2-level folders. (authored by yuanyin).
Mar 15 2018
yuanyin committed R131:c0793ddaf470: Language list from Linguist introduced. Add other repository lists. (authored by yuanyin).
Mar 13 2018
yuanyin committed R131:03da51a5a4b2: Add an integrated script for dataset construction. (authored by yuanyin).
Mar 12 2018
Better lists of repositories.
Move scripts to 'scripts' folder.
yuanyin committed R131:b07124ec8013: Fix the issue when Linguist fails to dump JSON file. (authored by yuanyin).
Mar 11 2018
yuanyin committed R131:b7e1d93ea579: Change lister to new lister using GitHub API. (authored by yuanyin).
yuanyin committed R131:a88e7a457530: Script for classifying files by language (authored by yuanyin).
Mar 10 2018
yuanyin committed R131:ffec09c9eeaa: Add scripts for listing and cloning repositories. (authored by yuanyin).
Mar 3 2018
Move review to docs
yuanyin committed R131:351474f56dbc: import template from swh-py-template (init-py-repo) (authored by yuanyin).
Mar 1 2018
yuanyin committed R131:d28907521989: Add review of selected articles and tools. (authored by yuanyin).