LAN Storage Discovery Overview
Creation Date: August 18, 2006
Revision Date: September 08, 2014
Product: DS‑Client
Summary
For step-by-step instructions, see “Using the LAN Storage Discovery Tool”.
Analyzing LAN storage before deploying any backup application is very useful. The most important findings are the redundancy (duplication) factor and the frequency of data changes. The main purpose of the LAN Storage Discovery Tool is to give customers a clear view of their LAN by scanning and analyzing the files on it.
LAN Storage Discovery is a tool integrated in DS-Client, designed to analyze the LAN storage.
To analyze the LAN, you must first discover the shares on the LAN (or a particular part of the LAN) using a list of credentials you provide. Next, you arrange the share list, defining which shares to scan and how to scan them. Finally, you start the scan process for the files on the LAN.
Tools to manage and monitor the discovery and scan processes are also provided.
The scan process runs on DS-Client and can be scheduled. Several reports are generated to analyze the scanned storage: duplication, growth trends, access reports, file type distribution, etc. Summary and detailed reports are provided, as well as a customizable export feature.
LAN Storage Discovery Tool
The following sections describe the features of the LAN Storage Discovery Tool.
Discovery
Configure the range (extent) of your LAN that will be analyzed. This range is a collection of items. Each item can be a network provider, a domain, or a computer. A list of credentials can be provided for the discovery / scanning process. For Unix DS-Clients, an additional level of authentication for credentials is available.
The discovery process automatically checks for overlapping shares, provided the shared path can be retrieved. Overlapped shares are disabled for scanning by default to avoid counting false duplicates.
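The overlap check described above can be illustrated with a minimal sketch. It assumes each share is known by a host and a POSIX-style path, and treats a share as overlapped when another share on the same host has a path that is a proper ancestor of its own. The function name, the data layout, and the sample shares are hypothetical; this is not how DS-Client necessarily implements the check, and two shares with identical paths are not handled here.

```python
from pathlib import PurePosixPath

def find_overlapped(shares):
    """Return the names of shares whose path lies inside another share on the same host.

    `shares` maps a share name to a (host, path) tuple (hypothetical layout).
    """
    overlapped = set()
    for name, (host, path) in shares.items():
        inner = PurePosixPath(path)
        for other, (other_host, other_path) in shares.items():
            if name == other or host != other_host:
                continue
            # Covered when the other share's path is a proper ancestor of this one.
            if PurePosixPath(other_path) in inner.parents:
                overlapped.add(name)
    return overlapped

shares = {
    "data": ("srv1", "/export/data"),
    "projects": ("srv1", "/export/data/projects"),
    "home": ("srv1", "/export/home"),
    "data2": ("srv2", "/export/data/projects"),  # same path, different host
}
overlapped = find_overlapped(shares)
print(sorted(overlapped))
```

Scanning both "data" and "projects" would count every file under the projects directory twice, which is why the covered share is the one to disable.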
LAN Storage Discovery (Windows DS-Client) supports the following network providers:
Microsoft Windows Network
NetWare Services
NFS
LAN Storage Discovery (UNIX DS-Client) can discover shares on the network through:
NAS
NFS
SSH
Scan process
The scan process is applied to the list of shares found by the discovery process.
The discovery process automatically disables a share in the following scenarios:
If the share is covered by other shares.
If not enough credentials were provided to retrieve all necessary information from the share.
If the share is not supposed to be scanned (e.g., CD/DVD drives).
NOTE:  The status (disabled / enabled) can be manually changed from the GUI.
By default, the credentials used to discover the share are also set as the scan credentials for that share by the discovery process. However, the credentials used for scanning can be changed for particular shares from the GUI.
Whether the scan of a share follows reparse points (Microsoft Windows only) is configurable from the GUI.
To speed up the scanning process and reporting, the scanning process can be configured from the GUI to skip files that are smaller than a specific size.
The number of scanning threads is also configurable from the GUI (to provide scalability).
The scan process can be scheduled or started on demand. The best practice is to schedule it to run over a period of time (e.g., a few weeks) to get more accurate statistical information on data growth and changes.
A “Scan Monitor” is available to give a real-time overview of the scan process.
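As a rough illustration of the two tuning options described above (a minimum file size and a configurable number of scanning threads), the following sketch walks a set of directories concurrently and ignores files below a size threshold. The function names and parameters are hypothetical, local directories stand in for network shares, and the one-share-per-thread split is an assumption rather than DS-Client's actual design.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def scan_share(root, min_size=0):
    """Walk one share and return (path, size) for files of at least min_size bytes."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # unreadable or vanished file: skip it
            if size >= min_size:
                found.append((path, size))
    return found

def scan_shares(roots, min_size=0, threads=4):
    """Scan several shares concurrently, one share per worker thread."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        results = pool.map(lambda root: scan_share(root, min_size), roots)
        return dict(zip(roots, results))

# Small demonstration on a temporary directory standing in for a share.
with tempfile.TemporaryDirectory() as root:
    for name, length in (("small.txt", 5), ("large.bin", 20)):
        with open(os.path.join(root, name), "wb") as handle:
            handle.write(b"x" * length)
    result = scan_shares([root], min_size=10, threads=2)
    sizes = sorted(size for _path, size in result[root])
    print(sizes)
```

Skipping small files trades completeness for speed: the totals in the reports then cover only files at or above the threshold.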
Reports
The reports can be divided into the following three categories.
One-click generated reports
The One-click generated reports are the most important ones. Windows DS-Client has 12 reports and UNIX DS-Client has 8 reports. The generated reports will be saved as files in HTML and Microsoft Excel format. Many of them can be customized. The reports included are:
 
 
Share usage: Reports the total number of files and their total size (all files and duplicates) for each share.
Largest Files: Lists the top n (default is 100) largest files. Also reports their size as a percentage of the entire LAN.
Largest Duplicates: Lists the top n (default is 100) largest duplicate files. Also reports the total number of duplicate files and its percentage of the entire LAN, as well as the total size of the top n files compared to the entire LAN.
Ownership (Windows DS-Client only): Reports the total storage occupied by each owner for each share, as well as the overall total.
File Type Distribution: Selects the top n (default is 10) file types, sorted either by space or by number of files (default is by space). For each file type, selects the top m (default is 100) largest files from the entire LAN.
Partition Size (Windows DS-Client only): For each partition, reports the total, free, and used space.
Access Report / Dormant Files: For each share and each interval, reports the top n (default is 100) largest files whose last access time is within that interval. The intervals are configurable; by default, 5 intervals are provided: <1 Day, >1 Day, >1 Week, >1 Month, >1 Year. There is also a summary report of the total size and number of files for each interval of each share.
Growth and Modified Files: Reports the number and size of files created or modified in each share, for each day over the specified period (default is 14 days).
SQL Server Size (Windows DS-Client only): For each computer, reports the space (total, free, and used) for SQL Server.
Exchange Server Size (Windows DS-Client only): Reports all .edb and .pst files.
Index of Generated Reports: An index of all generated reports.
All Duplicate Files: A list of all duplicate files. By default, this report is not generated.
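To make the idea behind the Largest Duplicates report concrete, here is a minimal sketch that groups files into duplicate sets and returns the largest groups first. The source does not state how DS-Client identifies duplicates; matching on size plus a SHA-256 digest is an assumption made for this example, and the in-memory `files` mapping is a hypothetical stand-in for scanned files.

```python
import hashlib
from collections import defaultdict

def largest_duplicates(files, n=100):
    """Return up to n duplicate groups as (size, [paths]), largest first.

    `files` maps a path to its content as bytes (hypothetical stand-in
    for files read during a scan).
    """
    groups = defaultdict(list)
    for path, content in files.items():
        # Assumption: files are duplicates when size and SHA-256 digest match.
        key = (len(content), hashlib.sha256(content).hexdigest())
        groups[key].append(path)
    duplicates = [(size, sorted(paths))
                  for (size, _digest), paths in groups.items()
                  if len(paths) > 1]
    duplicates.sort(key=lambda item: item[0], reverse=True)
    return duplicates[:n]

files = {
    "a/report.doc": b"x" * 500,
    "b/report_copy.doc": b"x" * 500,
    "a/notes.txt": b"hello",
    "c/notes.txt": b"hello",
    "c/unique.bin": b"y" * 900,
}
top = largest_duplicates(files, n=1)
print(top)
```

The unique 900-byte file is the largest file overall but does not appear, because only groups with at least two members count as duplicates.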
LAN File Summary
This provides an overview of the scanned files on the LAN, as well as detailed information (down to file level). Complex filters are also available for analyzing a particular group of files.
 
 
Global view: An overview of the total number of files and their size for the following categories: all files, duplicate files, duplicate groups, changed and new files, unchanged files, and not accessed files. The changed / unchanged, new, and not accessed categories are determined by the ‘last n days’ parameter (default is the last 30 days).
Filtered view: Similar to the global view, but computed for each share based on a specified condition. The condition combines the following: ‘last n days’ (determines changed / unchanged, new, and not accessed files); a filter on the file name, such as ‘*.c*’; duplication (only files whose number of duplicates is equal to or greater than the specified number are counted); and share (a group of shares to be considered).
Show Details: Shows the detailed information of the files corresponding to the selected row from the global / filtered view.
Charts: Shows pie charts for the selected row from the global / filtered view. The charts can be based on size or on number of files for all categories (Duplicated, Unchanged, etc.).
Show Duplicates: Shows all files that are duplicates of the selected file.
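The way the filtered view combines its conditions can be sketched as follows. The record layout, field names, and function are hypothetical, and the `copies` field assumes that a duplicate count per file is already available from the scan. A file is counted for its share only if it passes the share, file name, and duplication filters; ‘last n days’ then decides whether it counts as changed or unchanged.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from fnmatch import fnmatch

@dataclass
class FileRecord:
    share: str
    name: str
    size: int
    modified: datetime
    copies: int  # assumed meaning: number of duplicates found for this file

def filtered_view(records, now, last_n_days=30, pattern="*", min_copies=0, shares=None):
    """Summarize, per share, the files that pass all filters."""
    cutoff = now - timedelta(days=last_n_days)
    view = {}
    for rec in records:
        if shares is not None and rec.share not in shares:
            continue  # share filter
        if not fnmatch(rec.name, pattern):
            continue  # file name filter, e.g. '*.c*'
        if rec.copies < min_copies:
            continue  # duplication filter
        row = view.setdefault(
            rec.share, {"files": 0, "size": 0, "changed": 0, "unchanged": 0})
        row["files"] += 1
        row["size"] += rec.size
        row["changed" if rec.modified >= cutoff else "unchanged"] += 1
    return view

now = datetime(2014, 9, 8)
records = [
    FileRecord("eng", "main.cpp", 100, now - timedelta(days=5), 2),
    FileRecord("eng", "util.c", 50, now - timedelta(days=90), 1),
    FileRecord("eng", "readme.txt", 10, now - timedelta(days=1), 4),
    FileRecord("ops", "tool.cs", 30, now - timedelta(days=1), 3),
]
view = filtered_view(records, now, pattern="*.c*", min_copies=2)
print(view)
```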
Other reports
These reports are already included in the “One-click generated reports.” However, they can be customized more extensively from DS-User > Tools Menu > LAN Storage Discovery: report_name.
 
 
Large Files report: Lists the top n (default is 100) largest files whose duplicate count is at least equal to the given value (default is all files) for the specified share or all shares.
Extension report: There are two kinds of reports. The “Top Report” shows the top n extensions ordered either by size or by number of files for the specified range (a share or all shares). The “Customize Report” asks for a list of specific extensions and the range (specific shares). The total size and number of files are also reported.
Access report: Reports the number and size of files for the given intervals, within the given range (specific shares). An interval can be a number of days, weeks, months, or years.
Growth report: The user provides the period for analysis: intervals (in days), range (specific shares), and type (total or new). The report shows the number and size of files for each interval that falls within the specified period.
Share usage report: Reports the number and size of files, grouped by owner, for the specified shares.
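The interval logic behind the Access report can be sketched as below, using the five default intervals of the one-click Access Report. The source does not say whether the intervals overlap. This example assumes each file falls into exactly one bucket (the oldest threshold its age exceeds), and that a month is 30 days and a year is 365 days. All names are hypothetical.

```python
from datetime import datetime, timedelta

# Assumed boundaries: a month is taken as 30 days and a year as 365 days.
THRESHOLDS = [(">1 Year", 365), (">1 Month", 30), (">1 Week", 7), (">1 Day", 1)]

def interval_label(last_access, now):
    """Return the single interval a file belongs to, based on its last access time."""
    age = now - last_access
    for label, days in THRESHOLDS:
        if age > timedelta(days=days):
            return label
    return "<1 Day"

def access_summary(files, now):
    """Count files and total size per interval; `files` holds (size, last_access) pairs."""
    summary = {}
    for size, last_access in files:
        row = summary.setdefault(interval_label(last_access, now),
                                 {"files": 0, "size": 0})
        row["files"] += 1
        row["size"] += size
    return summary

now = datetime(2014, 9, 8, 12, 0)
files = [
    (10, now - timedelta(hours=2)),
    (20, now - timedelta(days=3)),
    (5, now - timedelta(days=2)),
    (40, now - timedelta(days=400)),
]
summary = access_summary(files, now)
print(summary)
```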
Unix Support Limitations (Linux DS-Clients)
Soft links and device files will be silently skipped. Hard links will be treated as regular files. If two hard links point to the same file, they will be considered as duplicates of each other.
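The behavior described above can be illustrated with a short sketch for a POSIX system. It skips symbolic links and device files, and keeps hard links as ordinary files, so two hard links to the same data both appear in the result. The function name is hypothetical and this is not DS-Client's actual code; creating links as shown may fail on platforms without the required support or privileges.

```python
import os
import stat
import tempfile

def scan_entries(root):
    """Return the names of regular files in root, skipping soft links and device files."""
    regular = []
    for entry in sorted(os.listdir(root)):
        # lstat inspects the entry itself rather than a link's target.
        mode = os.lstat(os.path.join(root, entry)).st_mode
        if stat.S_ISLNK(mode) or stat.S_ISCHR(mode) or stat.S_ISBLK(mode):
            continue  # soft links and device files are silently skipped
        if stat.S_ISREG(mode):
            regular.append(entry)  # a hard link looks like any regular file
    return regular

with tempfile.TemporaryDirectory() as root:
    original = os.path.join(root, "data.txt")
    hard = os.path.join(root, "hard.txt")
    with open(original, "w") as handle:
        handle.write("payload")
    os.link(original, hard)                              # hard link
    os.symlink(original, os.path.join(root, "soft.txt"))  # soft link
    scanned = scan_entries(root)
    same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
    print(scanned, same_inode)
```

Because both names refer to the same inode, the data exists only once on disk, yet a scan that treats each name as a regular file reports the two as duplicates of each other.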