SAM in Trieste
Log of usage of SAM in Trieste
Configuration
One SAM station, "samtrieste-1", was installed in June 2002 on node
pccdf2.ts.infn.it (Dell 1400 server, dual P3-800, 2GB RAM) as per the
instructions, with 20GB of disk cache on local SCSI (/sam/cache1/boo,
aka ~sam/cache1/boo). The sam user is local only.
The sam products are installed on AFS via the cdfsoft shared installation.
No root intervention was needed other than creating the sam user and the
/sam file system, and making the sam station start automatically
at boot time as per the instructions.
Problems and fixes
- test_run_sam_submit script is not put on disk by addPkg
  - got it by cd SamInput/test/SamInputTest; cvs co -r 1.1 test_run_sam_submit
- job-script.sh is created without execute (x) permission and sam submit
  fails with "interpreter error"
  - fixed by adding chmod +x job-script.sh at the bottom of run_sam_submit,
    just before the echo command
- sam startup script would not work since my sam user has bash as
  default shell; also, I do not have the /home/cdfsoft2 area
  - modified the suggested sam_bootstrap script from the instructions, changing
    su -sam -c "source home/cdfsoft/cdf2.csh; ..." to
    su -sam -s /bin/tcsh -c "source /afs/infn.it/project/cdf/cdfsoft/cdf2.cshrc; ... "
    NOTE: this has now been fixed in the instructions by R.St.Denis
- SAM input module is very slow at processing begin-of-run records. (It is
  the same as FileInput and DHInput_2, not the older DHInput, as I learnt
  from Robert Kennedy; also note that the SAM/File/DH_2 input modules handle
  multibranch root, while DHInput does not.)
  - that's how it is for now; experts will try to speed it up. The reason
    is known
- SAM thinks the cache is only 2GB instead of 20 (see my e-mail)
  - traced to an install glitch, fixed
- difficulty with web diagnostic
- problem with sam dump station --all
- How do I tell SAM to process only one part of a dataset?
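A minimal sketch of the job-script.sh permission problem from the list above, reproduced outside SAM. The script body and the temporary directory are hypothetical stand-ins, not what run_sam_submit actually writes; only the chmod +x line corresponds to the fix described above.

```shell
#!/bin/sh
# Hypothetical stand-in for the job-script.sh generated by run_sam_submit.
workdir=$(mktemp -d)
script="$workdir/job-script.sh"
printf '#!/bin/sh\necho "job ran"\n' > "$script"
chmod a-x "$script"   # reproduce the problem: no execute permission

# Record whether the script can be executed before the fix.
if [ -x "$script" ]; then
    before=executable
else
    before=not-executable
fi

chmod +x "$script"    # the fix that was added to run_sam_submit

# With the execute bit set, the script can be launched directly.
output=$("$script")
echo "before fix: $before; after fix: $output"
```

Checking with test -x before submitting is a cheap way to catch this kind of failure locally instead of waiting for sam submit to report it.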
Successes
- Jun2-02: successfully ran all examples in the install instructions
- 12-aug-02: ran a sam job that accesses 1K events from the hbot1e dataset.
Multiple files are spooled from enstore to Trieste.
- 12-aug-02: created the hbot1e-run-145045 data set using the
SAM dataset editor; it restricts the hbot1e dataset to that single run
(4 files). Ran a sam job on it.
- 5-apr-02: updated the cdf-trieste sam station with Rick St.Denis' help to
copy the configuration used in cdf-glasgow. The list of current sam product versions is
here. Note: on Apr 8 I declared current sam_config v4_2_17 instead of v4_2_11 to align with the cdf-sam station on fcdfdata016.fnal.gov. I left sam_user and sam_common at a higher version number though, just in case those are needed to test a configuration different than cdf-sam.
- 5-apr-02: in the new configuration, used the latest version of the sam helper
from within the svtsim AC++ module to process in Trieste one raw data file from
SVT 4/5 special run 156456. See unixts.ts.infn.it:~belforte/svt/491/... esp. setup_for_sam.sh.
The dataset used was created with the SAM data set editor and is named
svt-test-run-1file
- mar/apr-03: with help from Rick St.Denis, while doing shift at fnal, I successfully upgraded the cdf-trieste sam station to the latest/current sw, and integrated the latest sam input into my usual svtsim AC++ job, managing to run in Trieste on raw data files from enstore. Here are the
instructions (in Italian!).
- apr-03: added nas:/cdf3 (196GB) to the sam station as a common NFS-mounted
sam cache disk, and NFS-mounted the previous one as well. Now cdf-sam has two cache disks (20+200GB), both NFS-mounted on all cdf desktops
- oct-03: managed to also get a sam station running at cnaf on wn-06-24-a,
using the same /afs installation of products as cdf-trieste. This sam station is called sam-cnaf
- oct-03: all of a sudden things stopped working. It turns out that at fnal they
disabled the possibility to use fcdfdata016 (aka cdf-sam) as intermediate cache for enstore, so we have to go to dcache directly. So I needed to upgrade all
product versions and apply a few more tricks (impossible to solve easily due to
lack of documentation; Rick+Stefan had to log in and fix it).
Then I installed gridFtp, since otherwise dCache limits us to using single
stream ftp on the weakly authenticated port.
  - See needed actions in here
- nov-03: made gridftp work from FZKA and FNAL. FZKA was blocked by a firewall
there; Ulrich got it opened.
Needed a lot of hand tests to figure out problems fast; a series
of tips for debugging sam cp commands is in
this file
Stefano Belforte
Last modified: Sat Nov 22 00:50:53 CET 2003