Page Menu
Home
Software Heritage
Search
Configure Global Search
Log In
Files
F8393202
README.md
No One
Temporary
Actions
View File
Edit File
Delete File
View Transforms
Subscribe
Mute Notifications
Award Token
Flag For Later
Size
9 KB
Subscribers
None
README.md
View Options
#
Deploy
a
Software
Heritage
stack
with
docker
deploy
According
you
have
a
properly
set
up
docker
swarm
cluster
with
support
for
the
`
docker
deploy
`
command
,
e
.
g
.:
```
~/
swh
-
docker
$
docker
node
ls
ID
HOSTNAME
STATUS
AVAILABILITY
MANAGER
STATUS
ENGINE
VERSION
py47518uzdb94y2sb5yjurj22
host2
Ready
Active
18.09
.
7
n9mfw08gys0dmvg5j2bb4j2m7
*
host1
Ready
Active
Leader
18.09
.
7
```
Note
:
this
might
require
you
activate
experimental
features
of
docker
as
described
in
[
docker
deploy
](
https
:
//docs.docker.com/engine/reference/commandline/deploy/)
documentation
.
In
the
following
how
-
to
,
we
will
assume
that
the
service
`
STACK
`
name
is
`
swh
`
(
this
name
is
the
last
argument
of
the
`
docker
deploy
`
command
below
).
Several
preparation
steps
will
depend
on
this
name
.
##
Set
up
volumes
Before
starting
the
`
swh
`
service
,
you
may
want
to
specify
where
the
data
should
be
stored
on
your
docker
hosts
.
By
default
it
will
use
docker
volumes
for
storing
databases
and
the
content
of
the
objstorage
(
thus
put
them
in
`
/
var
/
lib
/
docker
/
volumes
`
.
If
you
want
to
specify
a
different
location
to
put
a
storage
in
,
create
the
storage
before
starting
the
docker
service
.
For
example
for
the
`
objstorage
`
service
you
will
need
a
storage
named
`
<
STACK
>
_objstorage
`
:
```
~/
swh
-
docker
$
docker
volume
create
-
d
local
\
--
opt
type
=
none
\
--
opt
o
=
bind
\
--
opt
device
=/
data
/
docker
/
swh
-
objstorage
\
swh_objstorage
```
If
you
want
to
deploy
services
like
the
`
swh
-
objstorage
`
on
several
hosts
,
you
will
a
shared
storage
area
in
which
blob
objects
will
be
stored
.
Typically
a
NFS
storage
can
be
used
for
this
.
This
is
not
covered
in
this
doc
.
Please
read
the
documentation
of
docker
volumes
to
learn
how
to
use
such
a
device
as
volume
proviver
for
docker
.
Note
that
the
provided
`
docker
-
compose
.
yaml
`
file
have
a
few
placement
constraints
,
for
example
the
`
objstorage
`
service
is
forced
to
be
spawn
on
the
master
node
of
the
docker
swarm
cluster
.
Feel
free
to
remove
/
amend
these
constraints
if
needed
.
##
Managing
secrets
Shared
passwords
(
between
services
)
are
managed
via
`
docker
secret
`
.
Before
being
able
to
start
services
,
you
need
to
define
these
secrets
.
Namely
,
you
need
to
create
a
`
secret
`
for
:
-
`
postgres
-
password
`
For
example
:
```
~/
swh
-
docker
$
echo
'
strong
password
'
|
docker
secret
create
postgres
-
password
-
[...]
```
##
Creating
the
swh
service
From
within
this
repository
,
just
type
:
```
~/
swh
-
docker
$
docker
deploy
-
c
docker
-
compose
.
yml
swh
Creating
service
swh_web
Creating
service
swh_objstorage
Creating
service
swh_storage
Creating
service
swh_nginx
Creating
service
swh_memcache
Creating
service
swh_db
-
storage
~/
swh
-
docker
$
docker
service
ls
ID
NAME
MODE
REPLICAS
IMAGE
PORTS
bkn2bmnapx7w
swh_db
-
storage
replicated
1
/
1
postgres
:
11
2
ujcw3dg8f9d
swh_memcache
replicated
1
/
1
memcached
:
latest
l52hxxl61ijj
swh_nginx
replicated
1
/
1
nginx
:
latest
*:
5080
->
80
/
tcp
3
okk2njpbopx
swh_objstorage
replicated
1
/
1
softwareheritage
/
base
:
latest
zais9ey62weu
swh_storage
replicated
1
/
1
softwareheritage
/
base
:
latest
7
sm6g5ecff19
swh_web
replicated
1
/
1
softwareheritage
/
web
:
latest
```
This
will
start
a
series
of
containers
with
:
-
an
objstorage
service
,
-
a
storage
service
using
a
postgresql
database
as
backend
,
-
a
web
app
front
end
,
-
a
memcache
for
the
web
app
,
-
an
nginx
server
serving
as
reverse
proxy
for
the
swh
-
web
instances
.
##
Updating
a
configuration
When
you
modify
a
configuration
file
exposed
to
docker
services
via
the
`
docker
config
`
system
,
you
need
to
destroy
the
old
config
before
being
able
to
recreate
them
(
docker
is
currently
not
capable
of
updating
an
existing
config
.)
Unfortunately
that
also
means
you
need
to
recreate
every
docker
container
using
this
config
.
For
example
,
if
you
edit
the
file
`
conf
/
storage
.
yml
`
:
```
~/
swh
-
docker
$
docker
service
rm
swh_storage
swh_storage
~/
swh
-
docker
$
docker
config
rm
swh_storage
swh_storage
~/
swh
-
docker
$
docker
deploy
-
c
docker
-
compose
.
yml
swh
Creating
config
swh_storage
Creating
service
swh_storage
Updating
service
swh_nginx
(
id
:
l52hxxl61ijjxnj9wg6ddpaef
)
Updating
service
swh_memcache
(
id
:
2
ujcw3dg8f9dm4r6qmgy0sb1e
)
Updating
service
swh_db
-
storage
(
id
:
bkn2bmnapx7wgvwxepume71k1
)
Updating
service
swh_web
(
id
:
7
sm6g5ecff1979t0jd3dmsvwz
)
Updating
service
swh_objstorage
(
id
:
3
okk2njpbopxso3n3w44ydyf9
)
```
##
Updating
a
service
When
a
new
version
of
the
softwareheritage
/
base
image
is
published
,
running
services
must
updated
to
use
it
.
In
order
to
prevent
inconsistency
caveats
due
to
dependency
in
deployed
versions
,
we
recommend
that
you
shut
the
tail
services
off
(
especially
the
replayer
services
in
case
of
a
mirror
stack
).
This
can
be
done
as
follow
:
```
docker
service
update
--
image
\
$(
docker
inspect
-
f
'
{{
index
.
RepoDigests
0
}}
'
\
softwareheritage
/
base
:
latest
)
\
swh_graph
-
replayer
-
origin
```
#
Set
up
a
mirror
A
Software
Heritage
mirror
consists
in
base
Software
Heritage
services
,
as
described
above
without
any
worker
related
to
web
scraping
nor
source
code
repository
loading
.
Instead
,
filling
local
storage
and
objstorage
is
the
responsibility
of
kafka
based
`
replayer
`
services
:
-
the
`
graph
replayer
`
which
is
in
charge
of
filling
the
storage
(
aka
the
graph
),
and
-
the
`
content
replayer
`
which
is
in
charge
of
filling
the
object
storage
.
Ensure
configuration
files
are
properly
set
in
`
conf
/
graph
-
replayer
.
yml
`
and
`
conf
/
content
-
replayer
.
yml
`
,
then
you
can
start
these
services
with
:
```
~/
swh
-
docker
$
docker
deploy
-
c
docker
-
compose
.
yml
,
docker
-
compose
-
mirror
.
yml
swh
[...]
```
You
can
check
everything
is
running
with
:
```
~/
swh
-
docker
$
docker
ls
ID
NAME
MODE
REPLICAS
IMAGE
PORTS
88
djaq3jezjm
swh_db
-
storage
replicated
1
/
1
postgres
:
11
m66q36jb00xm
swh_grafana
replicated
1
/
1
grafana
/
grafana
:
latest
qfsxngh4s2sv
swh_content
-
replayer
replicated
1
/
1
softwareheritage
/
base
:
latest
qcl0n3ngr2uv
swh_graph
-
replayer
-
content
replicated
2
/
2
softwareheritage
/
base
:
latest
f1hop14w6b9h
swh_graph
-
replayer
-
directory
replicated
4
/
4
softwareheritage
/
base
:
latest
dcpvbf7h4fja
swh_graph
-
replayer
-
origin
replicated
2
/
2
softwareheritage
/
base
:
latest
1
njy5iuugmk2
swh_graph
-
replayer
-
release
replicated
2
/
2
softwareheritage
/
base
:
latest
cbe600nl9bdb
swh_graph
-
replayer
-
revision
replicated
4
/
4
softwareheritage
/
base
:
latest
5
hroiithan6c
swh_graph
-
replayer
-
snapshot
replicated
2
/
2
softwareheritage
/
base
:
latest
zn8dzsron3y7
swh_memcache
replicated
1
/
1
memcached
:
latest
wfbvf3yk6t41
swh_nginx
replicated
1
/
1
nginx
:
latest
*:
5081
->
5081
/
tcp
thtev7o0n6th
swh_objstorage
replicated
1
/
1
softwareheritage
/
base
:
latest
ysgdoqshgd2k
swh_prometheus
replicated
1
/
1
prom
/
prometheus
:
latest
u2mjjl91aebz
swh_prometheus
-
statsd
-
exporter
replicated
1
/
1
prom
/
statsd
-
exporter
:
latest
xyf2xgt465ob
swh_storage
replicated
1
/
1
softwareheritage
/
base
:
latest
su8eka2b5cbf
swh_web
replicated
1
/
1
softwareheritage
/
web
:
latest
```
If
everything
is
OK
,
you
should
have
your
mirror
filling
.
Check
docker
logs
:
```
~/
swh
-
docker
$
docker
service
logs
swh_content
-
replayer
[...]
```
and
:
```
~/
swh
-
docker
$
docker
service
logs
swh_graph
-
replayer
-
directory
[...]
```
##
Scaling
up
services
In
order
to
scale
up
a
replayer
service
,
you
can
use
the
`
docker
scale
`
command
.
For
example
:
```
~/
swh
-
docker
$
docker
service
scale
swh_graph
-
replayer
-
directory
=
4
[...]
```
will
start
4
copies
of
the
directory
replayer
service
.
Notes
:
-
One
graph
replayer
service
requires
a
steady
500
MB
to
1
GB
of
RAM
to
run
,
so
make
sure
you
have
properly
sized
machines
for
running
these
replayer
containers
,
and
to
monitor
these
.
-
The
overall
bandwidth
of
the
replayer
will
depend
heavily
on
the
`
swh_storage
`
service
,
thus
on
the
`
swh_db
-
storage
`
.
It
will
require
some
network
bandwidth
for
the
ingress
kafka
payload
(
this
can
easily
peak
to
several
hundreds
of
Mb
/
s
).
So
make
sure
you
have
a
correctly
tuned
database
and
enough
network
bw
.
-
Biggest
topics
are
the
directory
,
content
and
revision
.
File Metadata
Details
Attached
Mime Type
text/plain
Expires
Jun 4 2025, 7:10 PM (9 w, 4 d ago)
Storage Engine
blob
Storage Format
Raw Data
Storage Handle
3245679
Attached To
rCDFP Deployment tools for hosting a mirror
Event Timeline
Log In to Comment