FileLinkCopy scheduled job

Release 1.4 - ...

The FileLinkCopy job copies the binary data of files into the database, using the binary data field of the corresponding items. It is mainly, though not exclusively, used in conjunction with Faceted Search, in which case the binary data allows SQL Server to index the file contents.

Although this job already existed in earlier releases, this article describes the implementation as of release 1.4 (build 2).

The basics

As mentioned, the FileLinkCopy job has one specific task: check the items of a specific contenttype (FLK by default) in the database and, where needed, update them with the binary data of the corresponding file. To do so, it queries the database for items of the specified contenttype and collects the Url field (which contains the relative URL of the corresponding file) and a column that contains the lastWriteTime.

The column containing the lastWriteTime is configured using the CTFP FileLinkLastWriteField on the Url field of the specified contenttype. Usually, this is the CTSpecificDate1 field.

Then, for each item, it compares the lastWriteTime registered in the database with the lastWriteTime of the file that the Url field refers to. If the two do not match, the binary data field of the item is updated with the binary content of the file, and the lastWriteTime column is updated with the lastWriteTime of that file.

The binary database field to use is configured with the CTFP FileLinkPhysicalField (also on the Url field of the specified contenttype). Usually, this is the CTSpecificBinary1 field.

Parameters

The FileLinkCopy job has the following (optional) parameters:

Name         Description
contenttype  The code of the contenttype to be used. Note that the contenttype should be part of the Reference contenttype group. Defaults to FLK (FileLink).
maxfilesize  The maximum file size (in bytes). Files larger than this value will not be copied to the database. Defaults to Int32.MaxValue (2,147,483,647).
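
The defaults above can be illustrated with a short sketch. It is written in Python purely for illustration and is not the job's actual code; JobParameters and read_parameters are hypothetical names, and the assumption is that the parameters arrive as string key/value pairs.

```python
from dataclasses import dataclass
from typing import Mapping

INT32_MAX = 2**31 - 1  # Int32.MaxValue, i.e. 2,147,483,647


@dataclass(frozen=True)
class JobParameters:
    """Hypothetical container for the two optional job parameters."""
    contenttype: str = "FLK"
    maxfilesize: int = INT32_MAX


def read_parameters(raw: Mapping[str, str]) -> JobParameters:
    """Apply the documented defaults to missing or empty parameters."""
    contenttype = raw.get("contenttype") or "FLK"

    size_text = raw.get("maxfilesize")
    if size_text is None or size_text == "":
        maxfilesize = INT32_MAX
    else:
        maxfilesize = int(size_text)

    # Rejecting negative sizes is an assumption; the source does not
    # say how the job treats invalid values.
    if maxfilesize < 0:
        raise ValueError("maxfilesize must not be negative")
    return JobParameters(contenttype, maxfilesize)
```

With no parameters supplied, read_parameters({}) yields the contenttype FLK and a maxfilesize of 2,147,483,647 bytes.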

Implementation details

When this job is started, it first queries the database for items of the specified contenttype (FLK by default) using the vwActive view, collecting the Url field and the lastWriteTime column (usually CTSpecificDate1). These items are collected within a list.
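
As a rough sketch of this collection step, the query could look like the Python below. Only the vwActive view, the Url field and the configurable lastWriteTime column come from this article; the ItemNumber and ContentType column names are hypothetical, and sqlite3 merely stands in for the SQL Server connection so that the example is self-contained.

```python
import re
import sqlite3

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_item_query(last_write_column: str) -> str:
    """Return the SELECT that collects the candidate items.

    The column name comes from configuration (the FileLinkLastWriteField
    CTFP), so it cannot be bound as a query parameter and is validated
    as a plain identifier instead.
    """
    if not _IDENTIFIER.fullmatch(last_write_column):
        raise ValueError(f"invalid column name: {last_write_column!r}")
    # ItemNumber and ContentType are hypothetical column names.
    return (
        f"SELECT ItemNumber, Url, {last_write_column} "
        "FROM vwActive WHERE ContentType = ?"
    )


def collect_items(conn: sqlite3.Connection,
                  contenttype: str = "FLK",
                  last_write_column: str = "CTSpecificDate1") -> list:
    """Collect (item number, url, lastWriteTime) rows into a list."""
    cursor = conn.execute(build_item_query(last_write_column), (contenttype,))
    return cursor.fetchall()
```

The contenttype is passed as a bound parameter, while the configured column name is checked against a strict identifier pattern before it is placed in the query text.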

The job then iterates through this list and performs the following steps:

  • Check whether the file specified in the Url field exists. If not, the item is skipped.
    When logging has been enabled (category System, level 2), you will see a log message like: FileLinkCopyJob: Skipping item '<item number>' because the file '<filename>' does not exist.
  • Compare the LastWriteTime property of the corresponding file with the lastWriteTime as registered in the database. When these two match, the item is skipped.
    Log message (level 2): FileLinkCopyJob: Skipping item '<item number>' because the file '<filename>' has not been modified since '<datetime>'.
  • Check the file size. When it exceeds the maxfilesize parameter, the item is skipped.
    Log message (level 2): FileLinkCopyJob: Skipping item '<item number>' because the file '<filename>' is too big (<filesize> bytes).
  • When all checks have passed, the binary data field of the item will be updated with the binary data of the corresponding file and the lastWriteTime column will be updated with the lastWriteTime property of the file.
    If this fails for whatever reason, a log message will be emitted: FileLinkCopyJob: Skipping item '<item number>' because the database update failed. Exception: <exception message>
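
The steps above can be sketched as a single function. This is an illustrative Python sketch rather than the job's real code: process_item, update and log are hypothetical names, the caller is assumed to have already resolved the relative URL to a file path, and the timestamps are compared for exact equality, which may differ from the precision the actual job uses.

```python
import os
from datetime import datetime
from typing import Callable, Optional


def process_item(item_number: int,
                 path: str,
                 stored_last_write: Optional[datetime],
                 max_file_size: int,
                 update: Callable[[int, bytes, datetime], None],
                 log: Callable[[str], None] = print) -> bool:
    """Run the checks in order; return True only if the item was updated."""
    prefix = f"FileLinkCopyJob: Skipping item '{item_number}' because "

    # Step 1: the file must exist.
    if not os.path.isfile(path):
        log(prefix + f"the file '{path}' does not exist.")
        return False

    # Step 2: skip files whose lastWriteTime matches the stored value.
    info = os.stat(path)
    file_last_write = datetime.fromtimestamp(info.st_mtime)
    if stored_last_write == file_last_write:
        log(prefix + f"the file '{path}' has not been modified "
                     f"since '{stored_last_write}'.")
        return False

    # Step 3: skip files that exceed the maxfilesize parameter.
    if info.st_size > max_file_size:
        log(prefix + f"the file '{path}' is too big ({info.st_size} bytes).")
        return False

    # Step 4: store the binary data and the new lastWriteTime.
    # Reading the file inside the try block is an assumption of this sketch.
    try:
        with open(path, "rb") as handle:
            data = handle.read()
        update(item_number, data, file_last_write)
    except Exception as exc:
        log(prefix + f"the database update failed. Exception: {exc}")
        return False
    return True
```

The order of the checks matters: the cheap existence and timestamp checks run first, so the file content is only read for items that actually need an update.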