Reuse Files Across Multiple Simulations#
Running multiple simulations often involves reusing large files, such as the bathymetry of a coastal area or object geometries for computational fluid dynamics (CFD).
Instead of uploading these large files repeatedly for every simulation, Inductiva now offers a way to upload files once and reuse them across multiple tasks. This not only saves time but also reduces costs.
Additionally, you can use the outputs of one task as inputs for another, enabling easy checkpointing and reducing unnecessary data transfers.
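In this tutorial, we’ll guide you through uploading input files to remote storage, using the uploaded files in simulations, passing multiple remote inputs, managing your remote files, and reusing task outputs in later simulations.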
We’ve also included an FAQ section to address common questions and help you get started quickly.
Upload Input Files to Remote Storage#
You can upload files from either a local directory or a remote URL.
From a Local Directory#
Use the upload method to upload files or directories from your local system. Set the local_path parameter to the file or directory path you want to upload:
import inductiva

inductiva.storage.upload(
    local_path="gromacs-input-example/",
    remote_dir="my_remote_directory",
)
From a Remote URL#
Use the upload_from_url method to upload files directly from a remote location. Set the url parameter to the remote file’s location:
inductiva.storage.upload_from_url(
    url="https://storage.googleapis.com/inductiva-api-demo-files/test_assets/files.zip",
    remote_dir="my_remote_directory",
)
The remote_dir parameter specifies where the files will be stored remotely.
Use the Uploaded Files in Simulations#
Once your files are uploaded, you can reference them in your simulations using the remote_assets parameter of the simulator.run method.
Example with Gromacs Simulator
# machine_group and commands are defined as in the next section's example.
gromacs = inductiva.simulators.GROMACS()
task = gromacs.run(
    input_dir=None,
    commands=commands,
    on=machine_group,
    remote_assets=["my_remote_directory"])
Key Details:
- The remote_assets parameter specifies the remote storage location of the input files. It must match one of the directories you set as remote_dir in the previous step.
- The input_dir parameter can still be used for local files. If no remote_assets are provided, the input files are read from the local input_dir.
- If both remote_assets and input_dir are provided and files with the same name exist in both locations, the files from input_dir take priority.
- Only one of these parameters (remote_assets or input_dir) is required.
Use Multiple Remote Inputs#
The remote_assets parameter accepts a list, allowing you to specify multiple remote files or directories:
import inductiva

machine_group = inductiva.resources.MachineGroup("c2-standard-4")
machine_group.start()

commands = [
    "gmx solvate -cs tip4p -box 2.3 -o conf.gro -p topol.top",
    ("gmx grompp -f energy_minimization.mdp -o min.tpr -pp min.top -po min.mdp "
     "-c conf.gro -p topol.top"),
    "gmx mdrun -s min.tpr -o min.trr -c min.gro -e min.edr -g min.log",
    ("gmx grompp -f positions_decorrelation.mdp -o decorr.tpr -pp decorr.top "
     "-po decorr.mdp -c min.gro"),
    ("gmx mdrun -s decorr.tpr -o decorr.trr -x -c decorr.gro -e decorr.edr "
     "-g decorr.log"),
    ("gmx grompp -f simulation.mdp -o eql.tpr -pp eql.top -po eql.mdp "
     "-c decorr.gro"),
    ("gmx mdrun -s eql.tpr -o eql.trr -x trajectory.xtc -c eql.gro -e eql.edr "
     "-g eql.log"),
]

gromacs = inductiva.simulators.GROMACS()
task = gromacs.run(
    input_dir=None,
    commands=commands,
    on=machine_group,
    remote_assets=["gromacs_bucket/file1.txt", "gromacs_bucket/file2.txt"])
Maintain and Manage Remote Files#
You can list, clean, or manage your remote files directly through the Inductiva API or CLI.
List Remote Files#
In CLI:
inductiva storage ls
With Python:
inductiva.storage.listdir()
Remove Files or Directories#
Remove an entire directory:
inductiva.storage.remove_workspace(remote_dir="gromacs_bucket")
Remove a single file from a remote directory:
inductiva.storage.remove_workspace(remote_dir="gromacs_bucket/file1.txt")
Reuse Task Outputs in Simulations#
To reuse task outputs, simply include the task’s storage_path in the remote_assets parameter.
previous_task = inductiva.tasks.Task("<task_id>")
task = gromacs.run(
    input_dir=None,
    commands=commands,
    on=machine_group,
    remote_assets=[previous_task.info.storage_path])
You can also reference multiple tasks:
task = gromacs.run(
    input_dir=None,
    commands=commands,
    on=machine_group,
    remote_assets=[
        previous_task_1.info.storage_path,
        previous_task_2.info.storage_path,
    ])
All output files of a task are stored under the <task_id> path. For example, to use the file topol.top from a specific task <task_id>, prefix the file name with that path in your command:
commands = [
    "gmx solvate -cs tip4p -box 2.3 -o conf.gro -p <task_id>/topol.top",
    ...
]
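As a small illustration of the path rule above, the sketch below builds a remote_assets list and a command that points at a previous task's output file. It runs without an Inductiva account: the task object is a stand-in, the task ID and storage path are placeholder values, and the helper task_output_path is hypothetical rather than part of the Inductiva API. It assumes only what this tutorial states, namely that the task's files appear under a <task_id>/ prefix.

```python
from types import SimpleNamespace


def task_output_path(task_id: str, file_name: str) -> str:
    """Return the path of a previous task's output file, prefixed by its ID."""
    return f"{task_id}/{file_name}"


# Stand-in for inductiva.tasks.Task("<task_id>"); only the attribute used in
# this tutorial is modelled. Both values below are placeholders.
task_id = "abc123"
previous_task = SimpleNamespace(
    info=SimpleNamespace(storage_path="placeholder/abc123"))

# The previous task's storage path goes into remote_assets ...
remote_assets = [previous_task.info.storage_path]

# ... and commands refer to its files through the <task_id>/ prefix.
commands = [
    "gmx solvate -cs tip4p -box 2.3 -o conf.gro "
    f"-p {task_output_path(task_id, 'topol.top')}",
]

print(commands[0])
```

Building the prefix in one helper keeps the task ID in a single place when several commands read files from the same previous task.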
FAQs#
1. What file types can I upload?
Any file type can be uploaded and reused.
2. Where are my files stored in the remote storage?
Your files are stored in your personal area within Inductiva’s filesystem.
3. Can I upload multiple files at once?
Yes, but only when you upload a local directory using inductiva.storage.upload.
4. What happens if I try to upload a file to a remote directory that already contains a file with the same name?
Inductiva will show an error. You need to remove the existing file before uploading the new one.
5. Can I use files from different remote directories in a single task?
Yes, you can specify multiple remote files and directories in the remote_assets parameter.
6. How do I confirm that my upload was successful?
You can use inductiva.storage.listdir() in Python or inductiva storage ls in the CLI to check the contents of your remote directory.
7. Can I upload a zip file?
Yes, you can upload zip files, but they won’t be automatically unzipped. We plan to add this feature in the future.
8. I need to change one file in a directory I uploaded. Do I need to re-upload the entire directory?
No, you don’t have to re-upload everything. Simply remove the specific file using inductiva.storage.remove_workspace() and upload the updated version. Alternatively, you can override the file for a given task by providing the updated version via input_dir. See question 10.
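The remove-then-upload sequence can be wrapped in a small helper. This is a minimal sketch, not an Inductiva function: replace_remote_file is a hypothetical name, and the storage calls are passed in as parameters so the example runs as a dry run with recording stand-ins. The keyword arguments mirror the remove_workspace(remote_dir=...) and upload(local_path=..., remote_dir=...) calls shown earlier in this tutorial. That a single uploaded file lands directly under remote_dir is an assumption.

```python
from typing import Callable, List, Tuple


def replace_remote_file(local_path: str, remote_dir: str, file_name: str,
                        remove: Callable[..., object],
                        upload: Callable[..., object]) -> None:
    """Remove the stale remote copy, then upload the new version."""
    # Question 4: uploading over an existing name is an error, so remove first.
    remove(remote_dir=f"{remote_dir}/{file_name}")
    upload(local_path=local_path, remote_dir=remote_dir)


# Dry run: stand-ins that only record how they were called.
calls: List[Tuple[str, dict]] = []


def fake_remove(**kwargs):
    calls.append(("remove", kwargs))


def fake_upload(**kwargs):
    calls.append(("upload", kwargs))


replace_remote_file(
    local_path="gromacs-input-example/topol.top",
    remote_dir="my_remote_directory",
    file_name="topol.top",
    remove=fake_remove,
    upload=fake_upload,
)

for name, kwargs in calls:
    print(name, kwargs)
```

With the real client you would pass inductiva.storage.remove_workspace and inductiva.storage.upload in place of the stand-ins.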
9. Can I track the progress of my file upload?
Yes. If you’re uploading from your local system, a progress bar will appear. For uploads from a remote URL, you can use inductiva.storage.listdir() or inductiva storage ls to check progress.
10. What happens if files in input_dir and remote_assets have the same name?
Files in input_dir will take priority and overwrite files with the same name in remote_assets. This ensures that locally provided files always override remote files.
Example:
task = gromacs.run(
    input_dir="local_folder/",
    commands=commands,
    on=machine_group,
    remote_assets=["remote_folder/"]
)
If object.obj exists in both local_folder and remote_folder, the simulation will use the file from local_folder.
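This precedence rule can be pictured as a dictionary merge in which local entries are applied last. The sketch below is only a model of the documented behaviour, not Inductiva's implementation. The function resolve_inputs and the file names mesh.msh and config.json are hypothetical and used purely for illustration.

```python
from typing import Dict


def resolve_inputs(remote_files: Dict[str, str],
                   local_files: Dict[str, str]) -> Dict[str, str]:
    """Map each file name to the location whose copy the task would use."""
    merged = dict(remote_files)
    # Entries from input_dir are applied last, so they replace any
    # same-named entries coming from remote_assets.
    merged.update(local_files)
    return merged


# File name -> where that copy lives.
remote_folder = {"object.obj": "remote_folder", "mesh.msh": "remote_folder"}
local_folder = {"object.obj": "local_folder", "config.json": "local_folder"}

resolved = resolve_inputs(remote_folder, local_folder)
for name in sorted(resolved):
    print(f"{name} <- {resolved[name]}")
```

Files that exist in only one location are unaffected. Only same-named files are overridden by the local copy.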
11. What happens if I remove or update a remote directory while it’s being used by a task?
Once the task starts, the remote assets are copied to the task. Any changes won’t affect the ongoing simulation.
12. Does using remote assets improve task performance?
Yes, reusing remote assets reduces the need to upload large files repeatedly, cutting down task startup time.
13. Can I use remote files directly in remote_assets without uploading them first?
Not yet, but we’re planning to add this feature soon.
14. Can I reuse a single file from another task output?
Yes, just add the storage_path of the task to the remote_assets list, then reference the file in your commands with the <task_id>/ prefix.