diff --git a/talks-public/2019-04-03-RDA-WG/2019-04-03_RDA-WG.org b/talks-public/2019-04-03-RDA-WG/2019-04-03_RDA-WG.org index 92ce25e..2069f82 100644 --- a/talks-public/2019-04-03-RDA-WG/2019-04-03_RDA-WG.org +++ b/talks-public/2019-04-03-RDA-WG/2019-04-03_RDA-WG.org @@ -1,334 +1,334 @@ #+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt) #+TITLE: Software Source Code Identification #+SUBTITLE: Working Group #+AUTHOR: Roberto Di Cosmo #+EMAIL: roberto@dicosmo.org @rdicosmo @swheritage #+BEAMER_HEADER: \date{April 25nd, 2019} #+BEAMER_HEADER: \title[www.softwareheritage.org]{Identifiers for Digital Objects} #+BEAMER_HEADER: \author[Roberto Di Cosmo \hspace{5em} www.dicosmo.org]{{\bf Roberto Di Cosmo}, Daniel Katz, Martin Fenner} # #+BEAMER_HEADER: \setbeameroption{show notes on second screen} #+BEAMER_HEADER: \setbeameroption{hide notes} #+KEYWORDS: software heritage legacy preservation knowledge mankind technology #+LATEX_HEADER: \usepackage{tcolorbox} # # prelude.org contains all the information needed to export the main beamer latex source # use prelude-toc.org to get the table of contents # #+INCLUDE: "../../common/modules/prelude-toc.org" :minlevel 1 #+INCLUDE: "../../common/modules/169.org" # +LaTeX_CLASS_OPTIONS: [aspectratio=169,handout,xcolor=table] #+LATEX_HEADER: \usepackage{bbding} #+LATEX_HEADER: \DeclareUnicodeCharacter{66D}{\FiveStar} # # If you want to change the title logo it's here # # +BEAMER_HEADER: \titlegraphic{\includegraphics[width=0.7\textwidth]{SWH-logo}} # aspect ratio can be changed, but the slides need to be adapted # - compute a "resizing factor" for the images (macro for picblocks?) 
# # set the background image # # https://pacoup.com/2011/06/12/list-of-true-169-resolutions/ # #+BEAMER_HEADER: \pgfdeclareimage[height=90mm,width=160mm]{bgd}{swh-world-169.png} #+BEAMER_HEADER: \setbeamertemplate{background}{\pgfuseimage{bgd}} * Introduction ** Working group key facts -*** Spawned from - Software Source Code Interest Group +*** Joint RDA & FORCE11 WG which spawned from + RDA's Software Source Code IG & FORCE11's SCIWG *** Co-chairs - Roberto Di Cosmo - Daniel Katz - Martin Fenner *** Objectives - bring together people involved/interested in software identification - produce concrete recommendations for the academic community -*** +*** Online document: http://bit.ly/rda13scidwg please register there #+INCLUDE: "../../common/modules/rdc-bio.org::#main" :only-contents t :minlevel 2 * Setting the stage ** Software is Knowledge :PROPERTIES: :CUSTOM_ID: softwareknowledge :END: *** Software is /an essential component/ of modern scientific research :B_picblock: :PROPERTIES: :BEAMER_opt: pic=papermountain,width=.25\linewidth :BEAMER_env: picblock :BEAMER_act: +- - :END: + :END: Top 100 papers (Nature, October 2014)\\ #+BEGIN_QUOTE [...] the vast majority describe experimental methods or software that have become essential in their fields.\\ #+END_QUOTE http://www.nature.com/news/the-top-100-papers-1.16224 ** The source code is essential! :PROPERTIES: :CUSTOM_ID: thesourcecode :END: #+LATEX: \includegraphics[width=.10\linewidth]{software.png} -#+BEGIN_QUOTE +#+BEGIN_QUOTE “The source code for a work means the preferred form of the work for making modifications to it.”
\hfill GPL Licence #+END_QUOTE #+Beamer: \pause -*** +*** :PROPERTIES: :BEAMER_env: block :BEAMER_act: +- :END: #+latex: \begin{center} Hello World \end{center} *** Program (excerpt of binary) :B_block:BMCOL: :PROPERTIES: :BEAMER_col: 0.5 :BEAMER_env: block :BEAMER_act: +- :END: #+begin_src hex :exports code - 4004e6: 55 - 4004e7: 48 89 e5 - 4004ea: bf 84 05 40 00 - 4004ef: b8 00 00 00 00 - 4004f4: e8 c7 fe ff ff - 4004f9: 90 - 4004fa: 5d - 4004fb: c3 + 4004e6: 55 + 4004e7: 48 89 e5 + 4004ea: bf 84 05 40 00 + 4004ef: b8 00 00 00 00 + 4004f4: e8 c7 fe ff ff + 4004f9: 90 + 4004fa: 5d + 4004fb: c3 #+end_src *** Program (source code) :B_block:BMCOL: :PROPERTIES: :BEAMER_col: 0.55 :BEAMER_env: block :BEAMER_act: +- :END: #+begin_src c :exports code /* Hello World program */ #include <stdio.h> void main() { printf("Hello World"); } #+end_src ** Software Source Code is /special/ :PROPERTIES: :CUSTOM_ID: softwareisdifferent :END: *** Harold Abelson, Structure and Interpretation of Computer Programs /“Programs must be written for people to read, and only incidentally for machines to execute.”/ *** Quake 2 source code (excerpt) :B_block:BMCOL: :PROPERTIES: :BEAMER_col: 0.45 :BEAMER_env: block :END: #+LATEX: \includegraphics[width=\linewidth]{quake-carmack-sqrt-1.png} # smart efficient implementation of 1/sqrt(x) on a CPU without special support *** Net.
queue in Linux (excerpt) :B_block:BMCOL: :PROPERTIES: :BEAMER_col: 0.45 :BEAMER_env: block :END: #+LATEX: \includegraphics[width=\linewidth]{juliusz-sfb-short.png} # Juliusz implementation of stochastic fair blue in the Linux Kernel linux/net/sched/sch_sfb.c *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: *** Len Shustek, Computer History Museum \hfill /“Source code provides a view into the mind of the designer.”/ ** Forgotten pillar of (Open) Science -*** Lack of recognition +*** Lack of recognition :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .5 :END: not (yet) a first class citizen - in the EOSC plan - in the EU copyright reform - in the scholarly works #+BEAMER: \pause *** Lack of guidance/consensus on how to :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .5 :END: - choose a license - cite a software project - relate to industry best practices - make source code FAIR(*) #+BEAMER: \pause *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: *** Lack of basic prerequisites to reproducibility - See a discussion in \url{annex.softwareheritage.org/talks/2018/2018-09-17-STScI_public.pdf} - + See a discussion in \url{http://annex.softwareheritage.org/public/talks/2018/2018-09-17-STScI_public.pdf} + ** Interest in (research) software is rising *** A wealth of activities in academia - artifact evaluation :: \mbox{}\\ now commonplace in CS conferences - reproducible research :: \mbox{}\\ hot area of interest (jury still out on how to really do this) - software archival :: \mbox{}\\ publishers, open access portals, propose their services - academic credit :: \mbox{}\\ research software authors want recognition #+BEAMER: \pause *** Identifiers \hfill for all the above, proper *identifiers* are needed ** Challenges for academia *** Accept the complexity: software is /special/ - made by humans for humans: copyright law applies!
- not (just) data: you may have a nice hammer, but software is not a nail #+BEAMER: \pause *** Be humble: industry, developers, communities have been there do not - reinvent the wheel - diverge from COPs #+BEAMER: \pause *** Let's start from ... \hfill identifiers * Identifying software source code ** Fragmented landscape *** Academic initiatives :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .6 :END: - Force 11 Software Citation Principles WG - Freya EU project - OpenAire EU project - Publisher offerings #+BEAMER: \pause *** Industry initiatives :B_block: :PROPERTIES: :BEAMER_env: block :BEAMER_COL: .4 :END: - NSRL (NIST) - SPDX (Linux Foundation) - SWID (ISO Standard) - ... #+BEAMER: \pause *** :B_ignoreheading: :PROPERTIES: :BEAMER_env: ignoreheading :END: *** Transversal initiatives - Software Heritage \hfill (disclosure: I'm leading it) - #+BEAMER: \pause + ** Different motivations\hfill here are a few *** Give credit - citations :: that count for software authors #+BEAMER: \pause -*** Support reproducible research +*** Support reproducible research and reuse - references :: to retrieve the exact version of a software artefact used in research - #+BEANER: \pause + #+BEAMER: \pause *** Transparency - software bill of materials :: enable traceability of software artefacts ** It is way more complex than it seems *** All software projects are not born equal **** :B_column: :PROPERTIES: :BEAMER_env: column :BEAMER_COL: .45 :END: - - structure :: \mbox{} + - structure :: \mbox{} + monolithic + composite - lifetime :: \mbox{} + one shot + long running - community :: \mbox{} + single developer + large community **** :B_column: :PROPERTIES: :BEAMER_env: column :BEAMER_COL: .45 :END: - authorship :: \mbox{} + plurality of roles + difficulty of evaluating contributions - authority :: \mbox{} + just the commit log + top down + institution #+BEAMER: \pause *** Bottom line /software citation/ is much more than ... \hfill /software identification/!
* VALID UP TO HERE: WHAT FOLLOWS NEEDS TO BE REWORKED * Agenda - Introduction and motivation (15m, done) - Group work and discussion on the objectives of source code identification (20m) - Conceptual framework for source code identification (15m, done) - Presentation of an initial state-of-the-art in the area of software source code identification (15m) - Group work on a document describing the state-of-the-art (20m) - Wrap up: summary of results and next steps (10m) * The Software Heritage initiative #+INCLUDE: "../../common/modules/swh-goals-oneslide-vertical.org::#goals" :minlevel 2 ** A principled infrastructure \hfill \url{http://bit.ly/swhpaper} #+latex: \begin{center} #+ATTR_LATEX: :width 0.5\linewidth file:SWH-as-foundation-slim.png #+latex: \end{center} #+BEAMER: \pause #+latex: \centering #+ATTR_LATEX: :width \extblockscale{.7\linewidth} file:growth.png #+BEAMER: \pause *** Technology :PROPERTIES: :BEAMER_col: 0.34 :BEAMER_env: block :END: - transparency and FOSS - replicas all the way down *** Content (billions!) :PROPERTIES: :BEAMER_col: 0.32 :BEAMER_env: block :END: - *intrinsic identifiers* - facts and provenance *** Organization :PROPERTIES: :BEAMER_col: 0.33 :BEAMER_env: block :END: - non-profit - multi-stakeholder -* Looking for the right PIDs +* Looking for the right PIDs #+INCLUDE: "../../common/modules/swh-pids.org::#main" :only-contents t * Demo time ** A "wayback machine" for software source code *** Identifiers in action - *\url{http://archive.softwareheritage.org/browse}* * Conclusion ** Conclusion \hfill @swheritage #+BEAMER: \vspace{-1mm} -*** +*** - there are many systems of identifiers - DIOs and IDOs cater to different needs - IDOs enable *integrity* and *no middle man* properties *together* - Software Heritage is using IDOs for billions of objects, *today* - we believe IDOs are appropriate for most *digital born* content that has a *canonical* representation #+BEAMER: \vspace{-1mm} *** Come in, we're open!
\url{www.softwareheritage.org} --- learn more \\ \url{www.softwareheritage.org/support/sponsors/} --- sponsoring info \\ \url{www.softwareheritage.org/support/partners} --- partners \\ \url{forge.softwareheritage.org} --- our own code #+BEAMER: \vspace{-1mm} \flushright {\Huge Questions?} \vfill