Artifactory comes with a built-in Derby database that can be reliably used to store data for production-level repositories (up to hundreds of gigabytes).
Artifactory's storage layer supports pluggable storage implementations (made possible by the underlying Jackrabbit JCR), so you can configure Artifactory to run with almost any JDBC database or even store data completely on the file system.
Once-and-Only-Once Identical Content Storage
Artifactory stores identical binary files only once. When a new file about to be stored in Artifactory is found to have the same checksum as an already stored file, Artifactory will not store the new file content, but will make a link to the existing content in the metadata of the newly deployed file. This principle applies regardless of the repository and path under which artifacts are deployed: you can deploy the same file to many different coordinates, and as long as identical content is found in the storage it will be reused.
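The idea can be illustrated with a minimal sketch of checksum-keyed storage. This is not Artifactory's implementation: the `ContentStore` class, its in-memory dictionaries, and the choice of SHA-1 are assumptions made for illustration only (the text above says only "checksum").

```python
import hashlib


class ContentStore:
    """Toy in-memory store that keeps one copy of each distinct binary.

    Hypothetical illustration only; not Artifactory's actual storage layer.
    """

    def __init__(self):
        self.blobs = {}  # checksum -> content, stored once
        self.paths = {}  # (repository, path) -> checksum, the "link" in metadata

    def deploy(self, repository, path, content):
        """Deploy content at the given coordinates.

        Returns True if the content was newly stored, False if an
        identical blob already existed and was reused.
        """
        # SHA-1 is an assumption here; any content checksum would do.
        checksum = hashlib.sha1(content).hexdigest()
        is_new = checksum not in self.blobs
        if is_new:
            self.blobs[checksum] = content
        # The deployed file's metadata only points at the shared blob.
        self.paths[(repository, path)] = checksum
        return is_new

    def read(self, repository, path):
        """Return the content deployed at the given coordinates."""
        return self.blobs[self.paths[(repository, path)]]
```

Deploying the same bytes to two different repositories and paths leaves a single entry in `blobs` and two entries in `paths`, which mirrors the behavior described above.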
Changing the Storage Type Used
The general principle for changing the storage used by Artifactory is to edit the $ARTIFACTORY_HOME/etc/artifactory.system.properties file so that it points at the desired storage configuration path.
The path used can be either a relative path to $ARTIFACTORY_HOME/etc or an absolute path, and is expected to contain a repo.xml file (which is a Jackrabbit configuration file).
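The path rule above can be sketched as a small check to run before restarting. The function name `resolve_config_dir` is hypothetical, and this is only an illustration of the rule as described here, not the logic Artifactory itself uses.

```python
from pathlib import Path


def resolve_config_dir(artifactory_home, configured_path):
    """Resolve a configured storage path the way the text describes it.

    A relative path is taken relative to <artifactory_home>/etc; an
    absolute path is used as-is. The directory must contain repo.xml.
    """
    candidate = Path(configured_path)
    if not candidate.is_absolute():
        candidate = Path(artifactory_home) / "etc" / candidate
    if not (candidate / "repo.xml").is_file():
        raise FileNotFoundError(f"no repo.xml found in {candidate}")
    return candidate
```

Running a check like this first surfaces a mistyped path as a clear error message rather than as a startup failure.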
For a JDBC database you will also need to:
Download the appropriate JDBC driver and install it in your server's shared lib directory.
Create a database instance to which Artifactory will connect (when using an out-of-process database). Database tables will be auto-created.
Change the database details in the repo.xml file to match your database.
Backing up your existing installation: When changing the storage type for an existing installation, you will need to import the old Artifactory content and configuration from backup. Make sure to back up your older Artifactory system before using a different storage type.
Removing the old $ARTIFACTORY_HOME/data folder: If you have previously run Artifactory with a different storage type, you will need to remove (or move aside) the existing $ARTIFACTORY_HOME/data folder. Otherwise Artifactory will still use part of the old storage definitions and will fail to start (you will see record-not-found exceptions on startup). Starting with a clean or absent $ARTIFACTORY_HOME/data folder fixes this.
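Moving the folder aside rather than deleting it keeps the old data available in case you need to roll back. A minimal sketch, assuming Artifactory is stopped and a backup has been taken; the function name `move_data_aside` and the `data.old-<timestamp>` naming are hypothetical choices, not an Artifactory convention.

```python
import time
from pathlib import Path


def move_data_aside(artifactory_home):
    """Rename <artifactory_home>/data to a timestamped sibling folder.

    Returns the new location, or None if there was no data folder.
    Assumes Artifactory is not running while this is called.
    """
    data = Path(artifactory_home) / "data"
    if not data.exists():
        return None
    # Hypothetical naming scheme, e.g. data.old-20120208-101500
    target = data.with_name(f"data.old-{time.strftime('%Y%m%d-%H%M%S')}")
    data.rename(target)
    return target
```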
Changes to repo.xml: Except for updating the database details in repo.xml, do not make any other changes to the repo.xml file or try to manually replace it with a repo.xml from a newer Artifactory version.
All changes to the repo.xml file as part of an Artifactory upgrade are always applied automatically.
When Artifactory is Deployed as a WAR
If you deployed Artifactory as a WAR and have not specified a location for the $ARTIFACTORY_HOME directory, it will be auto-created by Artifactory under '$user.home/.artifactory'.
To use the bundled configuration files for common storage types, you may want to copy the etc/repo folder from the Artifactory distribution to your $ARTIFACTORY_HOME/etc. Then edit the $ARTIFACTORY_HOME/etc/artifactory.system.properties file as described above to point at the desired configuration.
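The copy step can be sketched as follows. Both function names are hypothetical, and reading the home location from an `ARTIFACTORY_HOME` environment variable is an assumption, since this page does not say how the location is specified. `Path.home()` stands in for the Java `user.home` property.

```python
import os
import shutil
from pathlib import Path


def artifactory_home():
    """Return the Artifactory home folder, falling back to ~/.artifactory.

    Assumption: the location, when specified, is available in the
    ARTIFACTORY_HOME environment variable.
    """
    configured = os.environ.get("ARTIFACTORY_HOME")
    return Path(configured) if configured else Path.home() / ".artifactory"


def copy_bundled_configs(distribution_dir, home):
    """Copy <distribution_dir>/etc/repo into <home>/etc/repo."""
    source = Path(distribution_dir) / "etc" / "repo"
    target = Path(home) / "etc" / "repo"
    # dirs_exist_ok requires Python 3.8+; files already present are overwritten.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

A call such as `copy_bundled_configs("/path/to/unpacked/distribution", artifactory_home())` would then make the bundled configurations available under the home folder; the distribution path shown is a placeholder.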
The Bundled Storage Configurations
Out of the box, Artifactory comes with built-in configurations (repo.xml files) for several storage types, as listed below.
Database Storage Types
The following configurations use a JDBC database for storage, and manage binaries as blobs with file system blob caching (1 GB by default).
The following configurations store all binaries as files (in $ARTIFACTORY_HOME/data/filestore), and use a JDBC database for repository metadata management (managing repository metadata in plain files is also possible but not recommended).
This setting typically yields the best performance with large repositories.
For raw Artifactory data backup, the folder $ARTIFACTORY_HOME/data/filestore needs to be backed up together with a DB dump, since both are needed. This does not impact Artifactory's own backup system, which is storage-agnostic.
Accessing a Remote Database
To avoid network latency issues when reading and writing artifact data, it is highly recommended to create the database on the same machine on which Artifactory will be running or on a fast SAN disk.
This is critical if the files are served from database blobs and the file system cache size is low.
6 Comments
Jan 11, 2010
michel morizot
Concerning the database storage (we have a rather big Derby DB, 56,000+ artifacts),
I had a few implementation questions:
i - We are currently facing memory allocation issues, apparently related to the 32-bit JVM garbage collection/memory allocation limit of 4 GB (2.6 GB in fact). One way out was to turn to a 64-bit JVM (an environment which we don't yet have) to enable greater memory allocation.
With the default Derby DB, is there a way to separate Artifactory itself from the DB process, into 2 different JVMs? Would it make sense? Instead of being limited to 4 GB for one big process, I'd have 2 processes limited to 4 GB each (4 GB for Artifactory and 4 GB for Derby).
ii - Following the previous question, would it be possible to have multiple Derby instances (one per Artifactory repo, for example) to break down the Derby process size? The "Once-and-Only-Once Identical Content Storage" chapter above kind of gives me an idea of the answer...
Jan 13, 2010
Frederic Simon
We have bigger repositories (in terms of number of artifacts) that run perfectly with the following configuration:
I highly suggest this configuration for your needs.
Jan 28, 2010
michel morizot
Indeed, migrating from Derby to the MySQL/filesystem configuration definitely did the trick.
ThnX!!
May 09, 2011
Graham Zabel
Hi, our C: drive is full so we'd like to move our default derby db from C: to D:. Is there any easy way to do this?
thanks.
Aug 27, 2011
Will
Hey Graham.
Me TOO. I thought it was an obvious thing to want to do, to move the data (repository) to another location. I'm scanning menus, looking in .properties and .xml files. With little satisfaction.
In contrast, moving my Maven repository took me a whole 5 minutes.
Maybe a cross-reference topic index is in order?
If you have found the answer, please put a note here. ;-) Thanks in advance.
Best regards,
Will
Feb 08, 2012
Felix Herzog
I would guess this is: shut down artifactory, copy data-Folder to D and put a symlink (softlink) for "data" to the old place. Start.
Well, test it before on a test environment.