diff --git a/common/modules/swh-ardc.org b/common/modules/swh-ardc.org index b2e71d5..b536360 100644 --- a/common/modules/swh-ardc.org +++ b/common/modules/swh-ardc.org @@ -1,364 +1,421 @@ #+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt) # # Software is all around us # #+INCLUDE: "prelude.org" :minlevel 1 #+INCLUDE: "169.org" * Source code pillar of Open Science, and how Software Heritage addresses ARDC :PROPERTIES: :CUSTOM_ID: main :END: ** Source code is /special/ (software is /not/ data) :PROPERTIES: :CUSTOM_ID: swnotdata :END: *** /Executable/ and /human readable/ knowledge \hfill copyright law :noexport: /“Programs must be written for people to read, and only incidentally for machines to execute.”/\\ \hfill Harold Abelson #+BEAMER: \pause *** Software /evolves/ over time - projects may last decades - the /development history/ is key to its /understanding/ #+BEAMER: \pause *** Complexity :B_picblock: :PROPERTIES: :BEAMER_env: picblock :BEAMER_OPT: pic=python3-matplotlib.pdf, width=.6\linewidth :END: - /millions/ of lines of code - large /web of dependencies/ + easy to break, difficult to maintain + /research software/ a thin top layer - sophisticated /developer communities/ *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: #+BEAMER: \pause *** Precious, endangered /executable/ and /human readable/ knowledge key people *passing away*, platforms (GoogleCode, Gitorious, etc.) closing down ...\\ \hfill no organised effort to catalog and archive it ** Source code is /special/, cont'd :PROPERTIES: :CUSTOM_ID: swnotdatacontd :END: *** Software is complex - Structure :: monolithic/composite; self-contained/external dependencies - Lifetime :: one-shot/long term - Community :: one man/one team/distributed community - Authorship :: multiple roles: Architecture, Management, Development, Documentation, Testing, ... 
- Authority :: institutions/organizations/communities/single person #+BEAMER: \pause *** Versioning, granularity - Project :: “Inria created OCaml and Scikit-learn”\pause - Release :: “2D Voronoi Diagrams were introduced in CGAL 3.1.0”\pause - Exact state of a project :: “This result was produced using commit 0064fbd...”\pause - Code fragment :: “The core algorithm is in lines 101 to 143 of the file parmap.ml contained in the precise state of the project corresponding to commit 0064fbd....” ** Software Source code: pillar of Open Science, multiple needs :PROPERTIES: :CUSTOM_ID: pillaropensciencecompact :END: *** Three pillars of Open Science :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .4 :END: #+latex: \begin{center} #+ATTR_LATEX: :width \extblockscale{1.2\linewidth} file:PreservationTriangle.png #+latex: \end{center} #+BEAMER: \pause *** A plurality of needs :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .6 :END: - Researcher :: - *archive* and *reference* software used in articles - *find* useful software - get *credit* for developed software - verify/reproduce/improve results #+BEAMER: \pause - Laboratory/team :: track software contributions - produce reports / web page #+BEAMER: \pause - Research Organization :: know its *software assets* - technology *transfer* - impact *metrics* ** Software Source code: a pillar of Open Science :PROPERTIES: :CUSTOM_ID: pillaropenscience :END: #+BEAMER: \vspace{-.5em} *** Software powers modern research :B_picblock: :PROPERTIES: :BEAMER_opt: pic=papermountain, leftpic=true, width=.4\linewidth :BEAMER_env: picblock :BEAMER_COL: .64 :END: #+BEGIN_QUOTE [...] software [...] essential in their fields. 
\mbox{}\hfill Top 100 papers (Nature, 2014) \vspace{.5em} #+END_QUOTE #+BEGIN_QUOTE Sometimes, if you don't have the software, you don't have the data \mbox{}\hfill Christine Borgman, Paris, 2018 #+END_QUOTE # http://www.nature.com/news/the-top-100-papers-1.16224 #+BEAMER: \pause *** Missing pillar: software (source code) :B_block: :PROPERTIES: :BEAMER_COL: .42 :BEAMER_env: block :END: #+latex: \begin{center} #+ATTR_LATEX: :width \extblockscale{1.2\linewidth} file:preservation_triangle_color.png #+latex: \end{center} #+BEAMER: \pause \hfill The links in the picture are *important* *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: #+BEAMER: \pause *** Nota Bene software may be a /tool/, a /research outcome/ and a /research object/\pause\\ \hfill access to the /source code/ is essential! ** A plurality of needs :PROPERTIES: :CUSTOM_ID: userneeds :END: *** Researchers **** :B_column: :PROPERTIES: :BEAMER_env: column :BEAMER_COL: .58 :END: - *archive* and *reference* software used in articles - *find* useful software **** :B_column: :PROPERTIES: :BEAMER_env: column :BEAMER_COL: .46 :END: - get *credit* for developed software - verify, *reproduce*, improve results #+BEAMER: \pause *** Laboratories/teams **** :B_column: :PROPERTIES: :BEAMER_env: column :BEAMER_COL: .4 :END: - *track* software contributions **** :B_column: :PROPERTIES: :BEAMER_env: column :BEAMER_COL: .5 :END: - produce reports - maintain web page #+BEAMER: \pause *** Research Organization know its *software assets* **** :B_column: :PROPERTIES: :BEAMER_env: column :BEAMER_COL: .4 :END: + technology *transfer* + impact *metrics* **** :B_column: :PROPERTIES: :BEAMER_env: column :BEAMER_COL: .5 :END: + funding *strategy* + career *evaluation* ** What is at stake: ARDC \hfill in increasing order of difficulty :PROPERTIES: :CUSTOM_ID: ardc :END: *** Archive Research software artifacts must be properly *archived*\\ \hfill make sure we can /retrieve/ them (/reproducibility/) #+BEAMER: \pause ***
:B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: \vspace{-.5em} *** Reference Research software artifacts must be properly *referenced*\\ \hfill make sure we can /identify/ them (/reproducibility/) #+BEAMER: \pause *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: \vspace{-.5em} *** Describe Research software artifacts must be properly *described*\\ \hfill make it easy to /discover/ and /reuse/ them (/visibility/) #+BEAMER: \pause *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: \vspace{-.5em} *** Cite/Credit Research software artifacts must be properly *cited* /(not the same as referenced!)/\\ \hfill to give /credit/ to authors (/evaluation/!) *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: +** What is at stake: before ARDC + :PROPERTIES: + :CUSTOM_ID: beforeardc + :END: +*** Development practices and tools + - version control system + - key metadata information (README, AUTHORS, LICENCE, etc.) + - build system + - test suites + - continuous integration + - ... +*** Opening up + - documentation + - community building + - ... 
+*** + needs proper training, and identification of best practices ** What is at stake: beyond ARDC :PROPERTIES: :CUSTOM_ID: beyondardc :END: *** Policy framework for dissemination, reuse, evaluation and recognition Define and promote an open source policy for publicly funded research software, including incentives and recognition for researchers and engineers #+BEAMER: \pause *** Sustainability Organisational schemas, legal tools, economic models, processes and policies to ensure research software can be maintained and sustained over time #+BEAMER: \pause *** Technology transfer and industry collaboration Approaches, support, methods, processes to establish connections with industry in order to foster uptake and transfer of research software #+BEAMER: \pause *** Advanced technologies and tools software quality, reproducibility, and traceability (including plagiarism detection) +** What is at stake: beyond ARDC + :PROPERTIES: + :CUSTOM_ID: beyondardc-evaluation + :END: +*** Sustainability, technology transfer + Organisational schemas, legal tools, economic models, processes and policies + to ensure research software can be maintained and sustained over time, maybe + in connection with industry +#+BEAMER: \pause +*** Evaluation (funding, careers, etc.) \hfill beware of /naive software citation counting/! + + human-in-the-loop evaluation (see the [[https://www-enseignementsup--recherche-gouv-fr.translate.goog/fr/remise-des-prix-science-ouverte-du-logiciel-libre-de-la-recherche-83576?_x_tr_sl=fr&_x_tr_tl=en&_x_tr_hl=en-US&_x_tr_pto=wapp][French National Prize]]) + + identify /roles/ in software projects, see: + #+BEGIN_EXPORT latex + \begin{thebibliography}{Foo Bar, 1969} + \footnotesize + \bibitem{alliez:hal-02135891} P. Alliez, R. Di Cosmo, B. Guedj, A. Girault, M.-S. Hacid, A. Legrand and N.
Rougier\newblock + \emph{Attributing and referencing (research) software: Best practices and outlook from Inria}, \newblock + CiSE 2020 \href{https://doi.ieeecomputersociety.org/10.1109/MCSE.2019.2949413}{(10.1109/MCSE.2019.2949413)} + \end{thebibliography} + #+END_EXPORT +#+BEAMER: \pause +*** Regulations are coming + software management plans, licensing, metadata and identification standards ** Addressing the four ARDC needs (see [[https://dx.doi.org/10.1007/978-3-030-52200-1_36][ICMS 2020]] for details) :PROPERTIES: :CUSTOM_ID: swh-ardc-short :END: *** Archive (10B+ files, 150M+ projects) :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .5 :END: #+ATTR_LATEX: :width .8\linewidth file:swh-dataflow-merkle.pdf \vspace{-1em} #+BEAMER: \pause - [[https://save.softwareheritage.org][save.softwareheritage.org]] - [[https://deposit.softwareheritage.org][deposit.softwareheritage.org]] # (HAL, IPOL) #+BEAMER: \pause *** Reference (20 billion SWHIDs) :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .5 :END: [[https://www.softwareheritage.org/2020/07/09/intrinsic-vs-extrinsic-identifiers/][Intrinsic, decentralised, cryptographically strong identifiers, SWHIDs]] \vspace{-1em} #+ATTR_LATEX: :width 1.02\linewidth file:SWHID-v1.4_3.png Now supported [[https://www.softwareheritage.org/2020/05/13/swhid-adoption/][in SPDX 2.2, Wikidata]] etc. 
#+BEAMER: \pause *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: *** Describe :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .5 :END: - /Intrinsic metadata/ from source code - Contributed the [[https://codemeta.github.io/codemeta-generator/][Codemeta generator]] #+BEAMER: \pause *** Cite/Credit :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .5 :END: - Contributed /software citation/ style [[https://www.ctan.org/tex-archive/macros/latex/contrib/biblatex-contrib/biblatex-software][biblatex-software, v 1.2-2 now on CTAN]] ** Addressing the A(rchive) in ARDC (see [[https://dx.doi.org/10.1007/978-3-030-52200-1_36][ICMS 2020]] for details) :PROPERTIES: :CUSTOM_ID: swh-a :END: #+latex: \vspace{-0.8em} -*** /Universal/ source code archive \hfill /not only research/ \hfill (11B+ files, 160M+ projects) +*** /Universal/ source code archive \hfill /not only research/ \hfill (12B+ files, 170M+ projects) :PROPERTIES: :BEAMER_env: block :END: #+ATTR_LATEX: :width .6\linewidth file:swh-dataflow-merkle.pdf #+latex: \vspace{-1em} - your research software /is likely there already/!
#+BEAMER: \pause - anyone can trigger archival with [[https://save.softwareheritage.org][save.softwareheritage.org]] #+BEAMER: \pause - selected partners can push to the archive via [[https://deposit.softwareheritage.org][deposit.softwareheritage.org]] # (HAL, IPOL) ** Addressing the R(eference) in ARDC (see [[https://dx.doi.org/10.1007/978-3-030-52200-1_36][ICMS 2020]] for details) :PROPERTIES: :CUSTOM_ID: swh-r :END: #+latex: \vspace{-0.8em} *** Software Heritage Identifiers (SWHID) \hfill [[https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html][link to full docs]] :B_block: :PROPERTIES: :BEAMER_env: block :END: 20+B [[https://www.softwareheritage.org/2020/07/09/intrinsic-vs-extrinsic-identifiers/][intrinsic, decentralised, cryptographically strong identifiers, SWHIDs]] # #+INCLUDE: "../../common/modules/swh-id-syntax.org::#swh-id-syntax" :only-contents t :minlevel 3 #+LATEX: \centering%\forcebeamerstart #+LATEX: \mode{\only<1>{\includegraphics[width=0.8\linewidth]{SWHID-v1.4_1.png}}} #+LATEX: \mode{\only<2>{\includegraphics[width=0.8\linewidth]{SWHID-v1.4_2.png}}} #+LATEX: \only<3->{\includegraphics[width=0.8\linewidth]{SWHID-v1.4_3.png}} #+LATEX: %\forcebeamerend *** vspace :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: #+latex: \vspace{-0.5em} *** :PROPERTIES: :BEAMER_act: <4-> :BEAMER_env: block :END: Emerging standard: Linux Foundation [[https://spdx.github.io/spdx-spec/appendix-VI-external-repository-identifiers/#persistent-id][SPDX 2.2]]; IANA registered; WikiData [[https://www.wikidata.org/wiki/Property:P6138][P6138]] *** vspace :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: #+latex: \vspace{-0.5em} *** Full-fledged /source code references/ for reproducibility :B_block: :PROPERTIES: :BEAMER_act: <5-> :BEAMER_env: block :END: Examples: [[https://archive.softwareheritage.org/swh:1:cnt:64582b78792cd6c2d67d35da5a11bb80886a6409;origin=https://github.com/virtualagc/virtualagc;lines=245-261/][Apollo 11
AGC excerpt]], [[https://archive.softwareheritage.org/swh:1:cnt:bb0faf6919fc60636b2696f32ec9b3c2adb247fe;origin=https://github.com/id-Software/Quake-III-Arena;lines=549-572/][Quake III rsqrt]]; Guidelines available, see [[https://dx.doi.org/10.1007/978-3-030-52200-1_36][ICMS 2020]] ** Addressing D(escribe) and C(ite) in ARDC (see [[https://dx.doi.org/10.1007/978-3-030-52200-1_36][ICMS 2020]] for details) :PROPERTIES: :CUSTOM_ID: swh-dc :END: *** Describe :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .5 :END: - Collect /intrinsic metadata/ - Contributed the [[https://codemeta.github.io/codemeta-generator/][Codemeta generator]] #+ATTR_LATEX: :width .8\linewidth file:CodeMetaGenerator.png #+BEAMER: \pause *** Cite/Credit :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .5 :END: - Contributed /software citation/ style [[https://www.ctan.org/tex-archive/macros/latex/contrib/biblatex-contrib/biblatex-software][biblatex-software, v 1.2-2 now on CTAN]] #+ATTR_LATEX: :width .8\linewidth file:BibLaTeX-swh.png + +** ARDC Best practices + :PROPERTIES: + :CUSTOM_ID: ardc-best-france + :END: +*** Archiving and referencing + For *all source code* used in research (/yes, even small scripts!/) + - ensure it is archived in Software Heritage (see [[https://save.softwareheritage.org/][save code now]]) + - get the proper *SWHID* for your software (see demo and [[https://www.softwareheritage.org/howto-archive-and-reference-your-code/][detailed HOWTO]]) + - add it to research articles for reproducibility (see demo and [[https://www.softwareheritage.org/howto-archive-and-reference-your-code/][detailed HOWTO]]) + #+BEAMER: \pause +*** Describing and Citing/Crediting + For *software you want to put forward* (/mention in your CV, reports, etc., get citations and credit for it/), + do the following *extra steps*: + - add *codemeta.json* with description (see demo and the [[https://codemeta.github.io/codemeta-generator/][codemeta generator]]) + - reference it in the HAL portal 
(see demo and online documentation) + - cite software using the [[https://ctan.org/pkg/biblatex-software][biblatex-software]] package (in CTAN and TeXLive)