diff --git a/talks-public/2019-04-03-RDA-WG/2019-04-03_RDA-WG.org b/talks-public/2019-04-03-RDA-WG/2019-04-03_RDA-WG.org new file mode 100644 index 0000000..92ce25e --- /dev/null +++ b/talks-public/2019-04-03-RDA-WG/2019-04-03_RDA-WG.org @@ -0,0 +1,334 @@ +#+COLUMNS: %40ITEM %10BEAMER_env(Env) %9BEAMER_envargs(Env Args) %10BEAMER_act(Act) %4BEAMER_col(Col) %10BEAMER_extra(Extra) %8BEAMER_opt(Opt) +#+TITLE: Software Source Code Identification +#+SUBTITLE: Working Group +#+AUTHOR: Roberto Di Cosmo +#+EMAIL: roberto@dicosmo.org @rdicosmo @swheritage +#+BEAMER_HEADER: \date{April 25nd, 2019} +#+BEAMER_HEADER: \title[www.softwareheritage.org]{Identifiers for Digital Objects} +#+BEAMER_HEADER: \author[Roberto Di Cosmo \hspace{5em} www.dicosmo.org]{{\bf Roberto Di Cosmo}, Daniel Katz, Martin Fenner} +# #+BEAMER_HEADER: \setbeameroption{show notes on second screen} +#+BEAMER_HEADER: \setbeameroption{hide notes} +#+KEYWORDS: software heritage legacy preservation knowledge mankind technology +#+LATEX_HEADER: \usepackage{tcolorbox} +# +# prelude.org contains all the information needed to export the main beamer latex source +# use prelude-toc.org to get the table of contents +# + +#+INCLUDE: "../../common/modules/prelude-toc.org" :minlevel 1 + + +#+INCLUDE: "../../common/modules/169.org" + +# +LaTeX_CLASS_OPTIONS: [aspectratio=169,handout,xcolor=table] + +#+LATEX_HEADER: \usepackage{bbding} +#+LATEX_HEADER: \DeclareUnicodeCharacter{66D}{\FiveStar} + +# +# If you want to change the title logo it's here +# +# +BEAMER_HEADER: \titlegraphic{\includegraphics[width=0.7\textwidth]{SWH-logo}} + +# aspect ratio can be changed, but the slides need to be adapted +# - compute a "resizing factor" for the images (macro for picblocks?) 
+# +# set the background image +# +# https://pacoup.com/2011/06/12/list-of-true-169-resolutions/ +# +#+BEAMER_HEADER: \pgfdeclareimage[height=90mm,width=160mm]{bgd}{swh-world-169.png} +#+BEAMER_HEADER: \setbeamertemplate{background}{\pgfuseimage{bgd}} +* Introduction +** Working group key facts +*** Spawned from + Software Source Code Interest Group +*** Co-chairs + - Roberto Di Cosmo + - Daniel Katz + - Martin Fenner +*** Objectives + - bring together people involved/interested in software identification + - produce concrete recommendations for the academic community +*** + Online document: http://bit.ly/rda13scidwg please register there +#+INCLUDE: "../../common/modules/rdc-bio.org::#main" :only-contents t :minlevel 2 +* Setting the stage +** Software is Knowledge + :PROPERTIES: + :CUSTOM_ID: softwareknowledge + :END: +*** Software is /an essential component/ of modern scientific research :B_picblock: + :PROPERTIES: + :BEAMER_opt: pic=papermountain,width=.25\linewidth + :BEAMER_env: picblock + :BEAMER_act: +- + :END: +Top 100 papers (Nature, October 2014)\\ +#+BEGIN_QUOTE +[...] the vast majority describe experimental methods or software that have become essential in their fields.\\ +#+END_QUOTE +http://www.nature.com/news/the-top-100-papers-1.16224 +** The source code is essential! + :PROPERTIES: + :CUSTOM_ID: thesourcecode + :END: + #+LATEX: \includegraphics[width=.10\linewidth]{software.png} +#+BEGIN_QUOTE + “The source code for a work means the preferred form of the work for making modifications to it.”
+ \hfill GPL Licence +#+END_QUOTE +#+Beamer: \pause +*** + :PROPERTIES: + :BEAMER_env: block + :BEAMER_act: +- + :END: +#+latex: \begin{center} Hello World \end{center} +*** Program (excerpt of binary) :B_block:BMCOL: + :PROPERTIES: + :BEAMER_col: 0.5 + :BEAMER_env: block + :BEAMER_act: +- + :END: +#+begin_src hex :exports code + 4004e6: 55 + 4004e7: 48 89 e5 + 4004ea: bf 84 05 40 00 + 4004ef: b8 00 00 00 00 + 4004f4: e8 c7 fe ff ff + 4004f9: 90 + 4004fa: 5d + 4004fb: c3 +#+end_src +*** Program (source code) :B_block:BMCOL: + :PROPERTIES: + :BEAMER_col: 0.55 + :BEAMER_env: block + :BEAMER_act: +- + :END: + +#+begin_src c :exports code +/* Hello World program */ + +#include <stdio.h> + +void main() +{ + printf("Hello World"); +} +#+end_src +** Software Source Code is /special/ + :PROPERTIES: + :CUSTOM_ID: softwareisdifferent + :END: +*** Harold Abelson, Structure and Interpretation of Computer Programs + /“Programs must be written for people to read, and only incidentally for machines to execute.”/ + +*** Quake 2 source code (excerpt) :B_block:BMCOL: + :PROPERTIES: + :BEAMER_col: 0.45 + :BEAMER_env: block + :END: + #+LATEX: \includegraphics[width=\linewidth]{quake-carmack-sqrt-1.png} + # smart efficient implementation of 1/sqrt(x) on a CPU without special support +*** Net.
queue in Linux (excerpt) :B_block:BMCOL: + :PROPERTIES: + :BEAMER_col: 0.45 + :BEAMER_env: block + :END: + #+LATEX: \includegraphics[width=\linewidth]{juliusz-sfb-short.png} + # Juliusz's implementation of stochastic fair blue in the Linux kernel, linux/net/sched/sch_sfb.c + +*** :B_ignoreheading: + :PROPERTIES: + :BEAMER_env: ignoreheading + :END: +*** Len Shustek, Computer History Museum + \hfill /“Source code provides a view into the mind of the designer.”/ +** Forgotten pillar of (Open) Science +*** Lack of recognition + :PROPERTIES: + :BEAMER_env: block + :BEAMER_COL: .5 + :END: + not (yet) a first-class citizen + - in the EOSC plan + - in the EU copyright reform + - in scholarly works +#+BEAMER: \pause +*** Lack of guidance/consensus on how to + :PROPERTIES: + :BEAMER_env: block + :BEAMER_COL: .5 + :END: + - choose a license + - cite a software project + - relate to industry best practices + - make source code FAIR(*) +#+BEAMER: \pause +*** :B_ignoreheading: + :PROPERTIES: + :BEAMER_env: ignoreheading + :END: +*** Lack of basic prerequisites for reproducibility + See a discussion in \url{annex.softwareheritage.org/talks/2018/2018-09-17-STScI_public.pdf} + +** Interest in (research) software is rising +*** A wealth of activities in academia + - artifact evaluation :: \mbox{}\\ + now commonplace in CS conferences + - reproducible research :: \mbox{}\\ + hot area of interest (jury still out on how to really do this) + - software archival :: \mbox{}\\ + publishers and open access portals offer their services + - academic credit :: \mbox{}\\ + research software authors want recognition +#+BEAMER: \pause +*** Identifiers + \hfill for all the above, proper *identifiers* are needed +** Challenges for academia +*** Accept the complexity: software is /special/ + - made by humans for humans: copyright law applies!
+ - not (just) data: you may have a nice hammer, but software is not a nail +#+BEAMER: \pause +*** Be humble: industry, developers, communities have been there + do not + - reinvent the wheel + - diverge from COPs +#+BEAMER: \pause +*** Let's start from ... + \hfill identifiers +* Identifying software source code +** Fragmented landscape +*** Academic initiatives :B_block: + :PROPERTIES: + :BEAMER_env: block + :BEAMER_COL: .6 + :END: + - Force 11 Software Citation Principles WG + - Freya EU project + - OpenAire EU project + - Publisher offerings + #+BEAMER: \pause +*** Industry initiatives :B_block: + :PROPERTIES: + :BEAMER_env: block + :BEAMER_COL: .4 + :END: + - NSRL (NIST) + - SPDX (Linux Foundation) + - SWID (ISO Standard) + - ... + #+BEAMER: \pause +*** :B_ignoreheading: + :PROPERTIES: + :BEAMER_env: ignoreheading + :END: +*** Transversal initiatives + - Software Heritage \hfill (disclosure: I'm leading it) + #+BEAMER: \pause +** Different motivations\hfill here are a few +*** Give credit + - citations :: that count for software authors + #+BEAMER: \pause +*** Support reproducible research + - references :: to retrieve the exact version of a software artefact used in a piece of research + #+BEAMER: \pause +*** Transparency + - software bill of materials :: enable traceability of software artefacts +** It is way more complex than it seems +*** Not all software projects are born equal +**** :B_column: + :PROPERTIES: + :BEAMER_env: column + :BEAMER_COL: .45 + :END: + - structure :: \mbox{} + + monolithic + + composite + - lifetime :: \mbox{} + + one shot + + long running + - community :: \mbox{} + + single developer + + large community +**** :B_column: + :PROPERTIES: + :BEAMER_env: column + :BEAMER_COL: .45 + :END: + - authorship :: \mbox{} + + plurality of roles + + difficulty of evaluating contributions + - authority :: \mbox{} + + just the commit log + + top down + + institution + #+BEAMER: \pause +*** Bottom line + /software citation/ is much more than ...
\hfill /software identification/! +* VALID UP TO HERE: WHAT FOLLOWS NEEDS TO BE REWORKED +* Agenda + - Introduction and motivation (15m, done) + - Group work and discussion on the objectives of source code identification (20m) + - Conceptual framework for source code identification (15m, done) + - Presentation of an initial state of the art in the area of software source code identification (15m) + - Group work on a document describing the state of the art (20m) + - Wrap up: summary of results and next steps (10m) + +* The Software Heritage initiative +#+INCLUDE: "../../common/modules/swh-goals-oneslide-vertical.org::#goals" :minlevel 2 +** A principled infrastructure \hfill \url{http://bit.ly/swhpaper} + #+latex: \begin{center} + #+ATTR_LATEX: :width 0.5\linewidth + file:SWH-as-foundation-slim.png + #+latex: \end{center} + #+BEAMER: \pause + #+latex: \centering + #+ATTR_LATEX: :width \extblockscale{.7\linewidth} + file:growth.png + #+BEAMER: \pause +*** Technology + :PROPERTIES: + :BEAMER_col: 0.34 + :BEAMER_env: block + :END: + - transparency and FOSS + - replicas all the way down +*** Content (billions!)
+ :PROPERTIES: + :BEAMER_col: 0.32 + :BEAMER_env: block + :END: + - *intrinsic identifiers* + - facts and provenance +*** Organization + :PROPERTIES: + :BEAMER_col: 0.33 + :BEAMER_env: block + :END: + - non-profit + - multi-stakeholder +* Looking for the right PIDs + #+INCLUDE: "../../common/modules/swh-pids.org::#main" :only-contents t +* Demo time +** A "wayback machine" for software source code +*** Identifiers in action + - *\url{http://archive.softwareheritage.org/browse}* +* Conclusion +** Conclusion \hfill @swheritage + #+BEAMER: \vspace{-1mm} +*** + - there are many systems of identifiers + - DIOs and IDOs cater to different needs + - IDOs enable *integrity* and *no middle man* properties *together* + - Software Heritage is using IDOs for billions of objects, *today* + - we believe IDOs are appropriate for most *digital born* content that has a *canonical* representation + #+BEAMER: \vspace{-1mm} +*** Come in, we're open! + \url{www.softwareheritage.org} --- learn more \\ + \url{www.softwareheritage.org/support/sponsors/} --- sponsoring info \\ + \url{www.softwareheritage.org/support/partners} --- partners \\ + \url{forge.softwareheritage.org} --- our own code + #+BEAMER: \vspace{-1mm} \flushright {\Huge Questions?} \vfill diff --git a/talks-public/2019-04-03-RDA-WG/Makefile b/talks-public/2019-04-03-RDA-WG/Makefile new file mode 100644 index 0000000..68fbee7 --- /dev/null +++ b/talks-public/2019-04-03-RDA-WG/Makefile @@ -0,0 +1 @@ +include ../Makefile.slides diff --git a/talks-public/2019-04-03-RDA-WG/TODO b/talks-public/2019-04-03-RDA-WG/TODO new file mode 100644 index 0000000..d68ae0f --- /dev/null +++ b/talks-public/2019-04-03-RDA-WG/TODO @@ -0,0 +1,15 @@ +Add + + use cases + + reproducibility + + citation + + differences --> IDO/DIO + + + citation is a complex problem + + variability + + moderation + + authority + + + law + + copyright + +