
Add end to end tests for content rendering
Closed, Public

Authored by anlambert on Jul 1 2019, 6:01 PM.

Details

Summary

This diff aims to test the HTML rendering of the contents stored in the archive.
The purpose is to ensure that no regressions appear when JavaScript dependencies
or our custom frontend code evolve.

A first batch of tests will check that almost all the programming languages supported by the
highlightjs library are correctly highlighted.

The other tests will check that some special contents (images, PDF files, ...) are
correctly rendered.

In order to provide relevant input data to these new tests, some new endpoints are added
to swh-web (only available when running the end-to-end tests).

Related T1845

Diff Detail

Repository
rDWAPPS Web applications
Branch
content-rendering-e2e-tests
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 6804
Build 9523: Cypress tests for swh-web diffs (Jenkins)
Build 9522: tox-on-jenkins (Jenkins)
Build 9521: arc lint + arc unit

Event Timeline

vlorentz requested changes to this revision. Jul 1 2019, 6:14 PM
vlorentz added a subscriber: vlorentz.

Please add comments on the code added to swh/web/tests/data.py, I have no idea what it does.

And you are bundling lots of non-trivial code, some with no authoring attribution or license. I don't think we need all these tests. Testing a few languages and formats (and trusting highlightjs to do its job on the rest) should be enough.

swh/web/common/highlightjs.py
44

Where is that list coming from?

This revision now requires changes to proceed. Jul 1 2019, 6:14 PM

Please add comments on the code added to swh/web/tests/data.py, I have no idea what it does.

So the WIP tag ...

And you are bundling lots of non-trivial code, some with no authoring attribution or license. I don't think we need all these tests.

From my point of view, these tests need to be exhaustive, as code highlighting is one of the most important features of
swh-web/browse (we are a source code archive, right? We should be able to correctly highlight a wide range of contents).
I can try to find better source files that carry proper licenses, but it will be hard to find some for certain formats.

Testing a few languages and formats (and trusting highlightjs to do its job on the rest) should be enough.

I do not rely on the automatic language detection of highlightjs, as it does not work very well and it requires
reading the source file content client-side (not optimal in terms of performance). I use heuristics based
on file extensions, filenames, or content MIME types. This is not a perfect solution, but it works well in practice.
Of course, this could be improved in the future.
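
To illustrate the kind of heuristic described above, here is a minimal sketch in Python. The table contents, the function name `guess_language`, and the lookup order (filename first, then extension, then MIME type) are assumptions made for this example; it is not the code in swh/web/common/highlightjs.py.

```python
import os

# Hypothetical lookup tables, for illustration only. The values are
# highlight.js language names; the real tables are much larger.
FILENAME_TO_LANGUAGE = {
    "Makefile": "makefile",
    "Dockerfile": "dockerfile",
    "CMakeLists.txt": "cmake",
}
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".rs": "rust",
    ".cpp": "cpp",
    ".yml": "yaml",
}
MIME_TYPE_TO_LANGUAGE = {
    "text/x-python": "python",
    "text/x-c++": "cpp",
    "application/json": "json",
}


def guess_language(path, mime_type=None):
    """Return a highlight.js language name for a content, or None.

    Assumed precedence: exact filename, then file extension, then MIME type.
    """
    filename = os.path.basename(path)
    if filename in FILENAME_TO_LANGUAGE:
        return FILENAME_TO_LANGUAGE[filename]
    extension = os.path.splitext(filename)[1]
    if extension in EXTENSION_TO_LANGUAGE:
        return EXTENSION_TO_LANGUAGE[extension]
    # Fall back to the MIME type; .get() returns None for unknown or missing types.
    return MIME_TYPE_TO_LANGUAGE.get(mime_type)
```

With a lookup like this, the language is decided server-side from metadata alone, so the browser never has to scan the file content to guess it.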

swh/web/common/highlightjs.py
44

Generated from the highlightjs language aliases and completed by hand.
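
A rough sketch of what "generated from the aliases and completed by hand" could look like, assuming the aliases are available as a language-to-aliases mapping. The sample data, the names `LANGUAGE_ALIASES` and `MANUAL_EXTENSIONS`, and the helper `build_extension_map` are hypothetical and only illustrate the idea.

```python
# Hypothetical sample of "language -> aliases" data, not an actual
# highlight.js export.
LANGUAGE_ALIASES = {
    "python": ["py", "gyp"],
    "javascript": ["js", "jsx"],
    "yaml": ["yml"],
}

# Hypothetical hand-written entries for extensions that no alias covers.
MANUAL_EXTENSIONS = {
    "adb": "ada",
    "ads": "ada",
}


def build_extension_map(language_aliases, manual_extensions):
    """Map each alias to its language, then apply the manual entries."""
    extension_map = {}
    for language, aliases in language_aliases.items():
        for alias in aliases:
            extension_map[alias] = language
    # Manual entries are applied last, so they win over generated ones.
    extension_map.update(manual_extensions)
    return extension_map


EXTENSION_MAP = build_extension_map(LANGUAGE_ALIASES, MANUAL_EXTENSIONS)
```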

swh/web/common/highlightjs.py
44

Would it be possible to send highlightjs a PR to add the aliases, and bundle their data file directly?

Update:

  • rebase to master
  • add remaining programming languages to test highlighting
  • add test for Jupyter notebook rendering
anlambert retitled this revision from [WIP] Add end to end tests for content rendering to Add end to end tests for content rendering. Jul 8 2019, 5:38 PM
anlambert edited the summary of this revision. (Show Details)

For your information, I have also tested how reliable the automatic language detection feature of the highlight.js library is.
Below are the results (tests without timing have failed):

Code highlighting tests
    1) should highlight source files with extension R
    2) should highlight source files with extension abnf
    ✓ should highlight source files with extension adb (1478ms)
    ✓ should highlight source files with extension adoc (1171ms)
    ✓ should highlight source files with extension ads (1783ms)
    3) should highlight source files with extension ahk
    4) should highlight source files with extension aj
    ✓ should highlight source files with extension applescript (694ms)
    5) should highlight source files with extension as
    ✓ should highlight source files with extension au3 (1522ms)
    6) should highlight source files with extension awk
    7) should highlight source files with extension bas
    ✓ should highlight source files with extension bat (836ms)
    ✓ should highlight source files with extension bf (783ms)
    8) should highlight source files with extension bnf
    ✓ should highlight source files with extension bsl (1231ms)
    ✓ should highlight source files with extension cal (1096ms)
    ✓ should highlight source files with extension capnp (1431ms)
    ✓ should highlight source files with extension cc (1183ms)
    9) should highlight source files with extension ceylon
    ✓ should highlight source files with extension clj (773ms)
    ✓ should highlight source files with extension cls (840ms)
    ✓ should highlight source files with extension cmake (941ms)
    ✓ should highlight source files with extension coffee (946ms)
    10) should highlight source files with extension cpp
    11) should highlight source files with extension cr
    ✓ should highlight source files with extension cs (872ms)
    ✓ should highlight source files with extension css (878ms)
    ✓ should highlight source files with extension d (1132ms)
    12) should highlight source files with extension dart
    ✓ should highlight source files with extension dcl (1453ms)
    13) should highlight source files with extension dfm
    14) should highlight source files with extension diff
    ✓ should highlight source files with extension do (1090ms)
    ✓ should highlight source files with extension dts (928ms)
    ✓ should highlight source files with extension dust (1082ms)
    15) should highlight source files with extension ebnf
    16) should highlight source files with extension elm
    ✓ should highlight source files with extension ep (1130ms)
    ✓ should highlight source files with extension erb (604ms)
    ✓ should highlight source files with extension erl (959ms)
    ✓ should highlight source files with extension ex (1055ms)
    ✓ should highlight source files with extension exs (868ms)
    ✓ should highlight source files with extension f90 (1025ms)
    17) should highlight source files with extension feature
    18) should highlight source files with extension flix
    ✓ should highlight source files with extension fs (1528ms)
    ✓ should highlight source files with extension gcode (799ms)
    ✓ should highlight source files with extension glsl (872ms)
    ✓ should highlight source files with extension gml (2535ms)
    19) should highlight source files with extension gms
    ✓ should highlight source files with extension go (1012ms)
    20) should highlight source files with extension golo
    21) should highlight source files with extension gradle
    22) should highlight source files with extension groovy
    23) should highlight source files with extension gss
    24) should highlight source files with extension haml
    25) should highlight source files with extension hbs
    26) should highlight source files with extension hs
    ✓ should highlight source files with extension hsp (1111ms)
    27) should highlight source files with extension hx
    ✓ should highlight source files with extension hy (2825ms)
    ✓ should highlight source files with extension icl (1104ms)
    28) should highlight source files with extension ini
    29) should highlight source files with extension ino
    30) should highlight source files with extension java
    31) should highlight source files with extension jl
    32) should highlight source files with extension js
    ✓ should highlight source files with extension json (1033ms)
    33) should highlight source files with extension kt
    34) should highlight source files with extension lasso
    35) should highlight source files with extension lc
    ✓ should highlight source files with extension ldif (802ms)
    ✓ should highlight source files with extension leaf (1086ms)
    ✓ should highlight source files with extension less (1026ms)
    36) should highlight source files with extension lisp
    ✓ should highlight source files with extension ll (1041ms)
    ✓ should highlight source files with extension ls (938ms)
    ✓ should highlight source files with extension lsl (1132ms)
    ✓ should highlight source files with extension lua (1030ms)
    37) should highlight source files with extension m
    ✓ should highlight source files with extension md (1461ms)
    ✓ should highlight source files with extension mel (1027ms)
    38) should highlight source files with extension mk
    39) should highlight source files with extension ml
    40) should highlight source files with extension moon
    41) should highlight source files with extension nb
    42) should highlight source files with extension nim
    ✓ should highlight source files with extension nix (779ms)
    ✓ should highlight source files with extension nsi (1502ms)
    43) should highlight source files with extension p
    ✓ should highlight source files with extension pas (1044ms)
    44) should highlight source files with extension pbi
    ✓ should highlight source files with extension pde (1964ms)
    45) should highlight source files with extension php
    46) should highlight source files with extension pl
    47) should highlight source files with extension pm
    48) should highlight source files with extension pony
    49) should highlight source files with extension pp
    50) should highlight source files with extension properties
    ✓ should highlight source files with extension proto (1654ms)
    51) should highlight source files with extension ps1
    ✓ should highlight source files with extension py (1057ms)
    52) should highlight source files with extension q
    53) should highlight source files with extension qml
    ✓ should highlight source files with extension rb (1220ms)
    ✓ should highlight source files with extension re (1053ms)
    54) should highlight source files with extension rib
    55) should highlight source files with extension rs
    ✓ should highlight source files with extension rsc (1811ms)
    56) should highlight source files with extension s
    ✓ should highlight source files with extension sas (1004ms)
    57) should highlight source files with extension scad
    ✓ should highlight source files with extension scala (1621ms)
    58) should highlight source files with extension sci
    ✓ should highlight source files with extension scm (972ms)
    ✓ should highlight source files with extension scss (1143ms)
    ✓ should highlight source files with extension sh (1335ms)
    ✓ should highlight source files with extension sl (1318ms)
    ✓ should highlight source files with extension smali (1371ms)
    ✓ should highlight source files with extension sml (807ms)
    ✓ should highlight source files with extension sqf (1264ms)
    59) should highlight source files with extension st
    ✓ should highlight source files with extension stan (898ms)
    60) should highlight source files with extension styl
    ✓ should highlight source files with extension subunit (1019ms)
    ✓ should highlight source files with extension swift (977ms)
    ✓ should highlight source files with extension tap (1231ms)
    61) should highlight source files with extension tcl
    ✓ should highlight source files with extension tex (1145ms)
    ✓ should highlight source files with extension thrift (854ms)
    62) should highlight source files with extension ts
    ✓ should highlight source files with extension v (2515ms)
    63) should highlight source files with extension vala
    ✓ should highlight source files with extension vb (1599ms)
    64) should highlight source files with extension vbs
    ✓ should highlight source files with extension vhd (5093ms)
    65) should highlight source files with extension vim
    66) should highlight source files with extension wl
    67) should highlight source files with extension xml
    ✓ should highlight source files with extension xqy (2337ms)
    68) should highlight source files with extension yml
    ✓ should highlight source files with extension zep (1373ms)
    ✓ should highlight source files with filenames .htaccess (1220ms)
    69) should highlight source files with filenames CMakeLists.txt
    70) should highlight source files with filenames Dockerfile
    71) should highlight source files with filenames Makefile
    ✓ should highlight source files with filenames access.log (1211ms)
    ✓ should highlight source files with filenames httpd.conf (1338ms)
    72) should highlight source files with filenames nginx.conf
    ✓ should highlight source files with filenames nginx.log (940ms)
    ✓ should highlight source files with filenames pf.conf (993ms)
    73) should highlight source files with filenames resolv.conf

So 73/152 tests fail when the heuristics I have implemented to associate a programming language with a filename are not used.
This confirms my view that using heuristics was the most reasonable approach.
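
For anyone who wants to recount, a tally like the one above can be computed from the report text. The sketch below assumes the Mocha-style layout shown in the report (passing lines start with a check mark, failing lines with a number followed by a parenthesis); the function name `tally` and the four-line sample are only illustrative.

```python
import re

# Short excerpt of the report, used as sample input.
REPORT = """\
    1) should highlight source files with extension R
    2) should highlight source files with extension abnf
    ✓ should highlight source files with extension adb (1478ms)
    ✓ should highlight source files with extension adoc (1171ms)
"""

PASSED = re.compile(r"^\s*✓ ")
FAILED = re.compile(r"^\s*\d+\) ")


def tally(report):
    """Return (passed, failed) counts for a Mocha-style report."""
    passed = failed = 0
    for line in report.splitlines():
        if PASSED.match(line):
            passed += 1
        elif FAILED.match(line):
            failed += 1
    return passed, failed


passed, failed = tally(REPORT)
print(f"{failed}/{passed + failed} tests failed")
```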

This revision now requires changes to proceed. Jul 11 2019, 1:40 PM
kalpitk added inline comments.
cypress/integration/content-rendering.spec.js
8

Should we move this to cypress/support/index.js so that it can be accessed by D1756 also?

cypress/integration/content-rendering.spec.js
8

I would rather move it to the cypress/integration/utils/index.js file. As this function is not needed in every test, I think it is better to import it only when required. I will update this diff today with that change (and also rebase it on master).

Update:

  • rebase
  • fix failing cypress tests
  • move the checkLanguageHighlighting function to the utils module

Next step: Write simple Hello World files or reuse code examples from the highlight.js demo for each language to test highlighting, instead of using random source files downloaded from the web.

Update:

  • really fix cypress tests that were failing
  • Use different test source files (taken from the highlight.js demo website) to avoid license issues

This Diff should now be in a landable state.

Update: ensure the cypress/fixtures folder exists as tests will fail otherwise

Update: Ensure that no remaining test source files contain explicit licensing

Update: All test source files are now taken from the highlight.js demo to ease their attribution

This revision was not accepted when it landed; it landed in state Needs Review. Aug 5 2019, 11:14 AM
This revision was automatically updated to reflect the committed changes.