Tecdoc queries and structure manual - version 2
MySQL installation and operation guide for TecDoc database
- File: En Filter Country Language 1-Functions Procedures.sql
- File: En Filter Country Language 2-Filter Country.sql
- File: En Filter Country Language 3-Filter Language.sql
- en_Filter_Country_Language info.txt
ATTENTION: Optimize your MySQL settings before loading such a large database into MySQL
- in some cases this reduces the load time from several days to several hours! See the section "MySQL optimization" below.
Folder and file structure
base.full/ - full SQL database dump (for all countries and in all languages)
base.noctm/ - full SQL database dump (for all countries and in all languages) without the column of parts delivery availability to different countries - after loading into MySQL it is 4 times smaller than base.full!
base.europe/ - SQL database dump filtered by the country group "Europe" (code 248), in all languages - after loading into MySQL it is 4 times smaller than base.full!
base.russia/ - SQL database dump filtered by the country group "Former USSR countries" (code 204) and the Russian language (code 16) - after loading into MySQL it is 5 times smaller than base.full!
distributive/ - the TecDoc catalog distribution (as DVD images) from which the data was extracted; the crack works fine on Windows XP/Vista/7/2003/2008/2008 R2.
queries/ - samples of SQL-queries for MySQL database.
queries/Filter_Country_Language/ - database downsizing script: filters the data by country and/or language. Lets you build from the base.full database your own analog of the base.russia database for any country/language combination.
addons/count_rows - number of rows in each table of every database version (useful as a check: has all the data been loaded?).
addons/models.php - PHP sample that outputs all models of the ALFA ROMEO brand (based on the sample SQL query MODELS.sql).
addons/SEARCH_TREE4.xlsx - table with a tree of all spare parts categories.
addons/tecdoc_schema.png - flowchart of the main TecDoc tables.
tecdoc_tables.txt - main TecDoc tables description.
useful_tables.txt - list of the tables used in my sample SQL queries - lets you avoid loading unnecessary tables into the database.
MySQL Database load procedure
1. Download necessary files
2. Check archives' consistency by MD5 checksums
3. Configure and restart MySQL
4. Create an empty database
5. Create the empty tables using the table-structure SQL script
6. Unpack and load the table data from the SQL dump
Upload of SQL-dumps into MySQL
The table structure and the data itself are kept in separate files. The data is additionally split into several files, according to the tables it is loaded into:
tecdoc.Version.BaseType.structure.mysql.StorageType.sql - SQL script that creates the structure of all tables, with all the indexes needed for fast execution of the sample SQL queries.
tecdoc.Version.BaseType.structure.mysql.StorageType.nokeys.sql - SQL script that creates the structure of all tables with a minimum set of indexes.
data/tecdoc.Version.BaseType.data.TableName.7z - archives containing the table data.
Here StorageType is InnoDB or MyISAM (a detailed description can be found below).
Do not try to load SQL dumps of this size through phpMyAdmin - it is useless.
Use the native console utility mysql (mysql.exe on Windows), which ships with the MySQL DBMS. On Windows run it in the command prompt; on UNIX, in an SSH console. If you have no SSH access to your hosting, try the PHP script BigDump: http://www.ozerov.de/bigdump/ (the console mysql client is still much faster, though)
To create empty tables, run the following command in a folder with SQL-script:
mysql -uLogin -pPassword BaseName <tecdoc.Version.BaseType.structure.mysql.StorageType.sql
To load all .sql dump files from the current folder, run the following command in that folder:
For UNIX: cat *.sql | mysql -uLogin -pPassword BaseName
For Windows: type *.sql | mysql -uLogin -pPassword BaseName
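One detail worth knowing about `cat *.sql`: the shell expands the glob in lexical name order, so that is the order in which mysql receives the statements. A minimal sketch with dummy files in a scratch directory (the mysql pipe itself is left out so it runs anywhere):

```shell
# Show that *.sql expands in lexical order - the order mysql
# would receive the dump files in.
tmp=$(mktemp -d)
cd "$tmp"
printf 'SELECT 2;\n' > b_table.sql
printf 'SELECT 1;\n' > a_table.sql
cat *.sql          # a_table.sql first, then b_table.sql
cd / && rm -rf "$tmp"
```

This matters only if one dump file depends on another; the TecDoc per-table dumps are independent, so any order works.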
If you want to load only some of the tables into MySQL, you do not need to download all the data archives - pick only the ones you need. Also remove the unnecessary tables from the table-structure SQL script, or run it in full and then drop the unnecessary empty tables from MySQL.
You certainly will not need ALL the TecDoc tables. For instance, to run all of my sample SQL queries, the 30 tables listed in useful_tables.txt are enough, out of a total of 77 TecDoc tables! Moreover, if you do not use all of the sample SQL queries, you can pick an even smaller subset of these 30, using the short descriptions in the same useful_tables.txt.
If you only need crosses from TecDoc (cross-links for searching analog spare parts), then the following tables are enough:
ART_LOOKUP, ARTICLES, BRANDS, DES_TEXTS, DESIGNATIONS, SUPPLIERS
And the following sample SQL queries will suffice: ART_LOOKUP.sql, ART_LOOKUP2.sql
Extracting archives
-------------------
The database is packed with the 7-Zip archiver. You can download it here:
p7zip is available both as a precompiled binary (for Linux x86: p7zip_*_x86_linux_bin.tar.bz2) and as source code (p7zip_*_src_all.tar.bz2), which can be built for any other platform. The standard package repositories of most operating systems already contain a p7zip build for that OS, usually named p7zip or p7zip-full (either is enough for extracting .7z archives).
On some Linux systems the command is named 7za instead of 7z - keep this in mind in all the examples below.
If you plan to use p7zip to extract not only the SQL dump but also the pictures, my advice is to use only versions 4.58-4.65: they unpack a large number of small files considerably faster than all the others (including the latest 9.xx betas) - up to 5 times faster! There is no such speed difference between the different versions of 7-Zip for Windows.
Extract from the command line this way:
7z x ArchiveName (or the name of the first part of the archive - with extension .001)
You can load SQL dumps into MySQL with 7-Zip extraction "on the fly" - my advice is to do it this way, as it is faster and requires less disk space. To load data into MySQL from all the downloaded archives in the current folder at once, use the following command:
7z x -so "*.7z" | mysql -uLogin -pPassword BaseName
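The shape of this pipeline (decompress to stdout, pipe into a consumer) can be tried without p7zip installed; in this sketch gzip stands in for `7z x -so`, purely to illustrate the pattern:

```shell
# Decompress-on-the-fly pipeline; gzip stands in for `7z x -so`
# so the example runs even without p7zip installed.
tmp=$(mktemp -d)
printf 'CREATE TABLE t (id INT);\n' > "$tmp/dump.sql"
gzip "$tmp/dump.sql"                    # -> dump.sql.gz
# With 7-Zip this would be: 7z x -so "*.7z" | mysql -uLogin -pPassword BaseName
gzip -dc "$tmp/dump.sql.gz" | wc -l     # the consumer reads the plain SQL stream
rm -rf "$tmp"
```

Nothing is ever written to disk uncompressed, which is exactly why the on-the-fly variant saves space.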
On Windows, specify the full paths to 7z.exe and mysql.exe, enclosing them in "double quotes" if the paths contain spaces.
The tables are loaded into MySQL in UTF-8. Loading all the tables takes 3-12 hours with the minimum set of indexes and 6-24 hours with the full set. If you choose the minimum option, the additional indexes needed for fast execution of the sample SQL queries can be added later from the queries/KEYS.sql file - although generating them will then take roughly the same 5 hours you saved during the initial load.
Archive consistency check (MD5)
-------------------------------
The archives are large and take a fairly long time to extract (especially when the data is loaded into the MySQL database directly during extraction). To avoid running into an error somewhere in the middle of the process, I advise checking the downloaded files for consistency first using the file dir.md5. It contains MD5 checksums for all the archives.
The MD5 check can be performed with:
1. The built-in Total Commander tools (on Windows)
2. The command-line utility md5sum
md5sum for Windows can be downloaded here: http://etree.org/cgi-bin/counter.cgi/software/md5sum.exe
md5sum for Linux/UNIX is part of GNU Coreutils (http://www.gnu.org/software/coreutils/), which ship with every Linux/UNIX distribution.
To launch type:
md5sum.exe -c dir.md5 & pause
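A self-contained sketch of the same check - generate a checksum file for a known payload, then verify it with `md5sum -c` (the file names here are made up for the demo; your real input is the shipped dir.md5):

```shell
# Create a dummy "archive part", record its checksum, then verify it -
# the same mechanics as checking the real dir.md5.
tmp=$(mktemp -d)
cd "$tmp"
printf 'dummy archive data\n' > part.7z.001
md5sum part.7z.001 > dir.md5     # create the checksum list
md5sum -c dir.md5                # prints "part.7z.001: OK" when intact
cd / && rm -rf "$tmp"
```

If a file were corrupted, the corresponding line would read FAILED and a WARNING summary would follow.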
If no WARNING line appears in the output, everything is OK.
MySQL optimization
------------------
I recommend using the InnoDB storage engine, as it lets you cache not only the indexes but also the table data in memory. If needed, you can load everything into MyISAM instead, using the corresponding table-structure SQL script.
All of the following parameters go into the [mysqld] section of the configuration file:
my.cnf (for Linux/UNIX) or my.ini (for Windows)
Restart MySQL after changing parameters in this file.
InnoDB storage optimization
---------------------------
If you load into the InnoDB storage engine, the main setting that affects the load (as well as further database operation) is:
innodb_buffer_pool_size = 1024M
It allocates 1024 MB of shared cache for InnoDB data and indexes, including index-creation operations. With this value, total MySQL memory usage is about 1.5 GB of RAM. If that much memory is not available, allocate at least 512 MB to this parameter.
Using InnoDB it is also useful to add the following parameter:
innodb_file_per_table = 1
It makes MySQL store each InnoDB table in a separate file named "BaseName/TableName.ibd". Otherwise, all the tables live in a shared file "ibdata1", which only ever grows and never shrinks - even when tables and databases are deleted!
Restart MySQL after changing the setting. Any InnoDB tables that existed in MySQL before remain in the file "ibdata1" after the parameter change - only tables created afterwards appear in separate files. If you want to move the existing InnoDB tables into separate files, run the following SQL command for each existing table:
ALTER TABLE BaseName.TableName ENGINE=InnoDB;
Then stop MySQL and delete the file "ibdata1". On the next MySQL start, a default empty 10 MB "ibdata1" file is created, and all the tables end up in separate files.
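Since the ALTER has to be issued once per table, it is convenient to generate the statements with a small loop. A sketch with a hardcoded table list (in practice you would take the names from SHOW TABLES and pipe the generated statements into the mysql client):

```shell
# Emit one ALTER TABLE statement per table; pipe the output into
# `mysql -uLogin -pPassword` to actually rebuild the tables.
for t in ARTICLES ART_LOOKUP BRANDS; do
  printf 'ALTER TABLE BaseName.%s ENGINE=InnoDB;\n' "$t"
done
```

Each ALTER rewrites the whole table, so expect this to take a while on the large TecDoc tables.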
MyISAM storage optimization
---------------------------
If you load into the MyISAM storage engine, use the following settings.
Index cache - it may not be used during index generation, but it is certainly used during further work with the tables:
key_buffer_size = 128M
RAM buffer for operations of indexes generation:
myisam_sort_buffer_size = 512M
Also set a fairly large limit for the temporary on-disk file used for index generation, otherwise the indexes are built very slowly via the key cache:
myisam_max_sort_file_size = 10G
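Putting the settings from both storage engine sections together, the [mysqld] fragment of my.cnf / my.ini might look like this (values taken from the recommendations above; tune them to the RAM you actually have):

```ini
[mysqld]
# InnoDB: shared cache for data and indexes (use at least 512M)
innodb_buffer_pool_size = 1024M
# one .ibd file per InnoDB table instead of a shared ibdata1
innodb_file_per_table = 1
# MyISAM: index cache for regular work with the tables
key_buffer_size = 128M
# MyISAM: RAM buffer for index generation
myisam_sort_buffer_size = 512M
# MyISAM: temp-file limit so indexes are not built via the key cache
myisam_max_sort_file_size = 10G
```

Remember to restart MySQL after editing the file, as noted above.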
SSH connection drops during data load on Linux/UNIX
---------------------------------------------------
There are two ways to deal with the issue:
1. You may run the command in the background by appending: >/dev/null 2>&1 &
E.g.: 7za x "*.7z" >/dev/null 2>&1 &
Then detach this background process from the current SSH session with the command: disown %1
Background process management in the Linux console is described in detail here: http://clinuxworld.com/ubuntu/36-administration/278-running-processes-in-the-background-and-foreground
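The background-plus-disown recipe can be tried safely with a harmless command - here sleep stands in for the long 7za run (disown is a bash builtin, hence the fallback for other shells):

```shell
# Start a "long" job in the background and detach it from the shell.
sleep 2 >/dev/null 2>&1 &       # stands in for: 7za x "*.7z" >/dev/null 2>&1 &
pid=$!
disown %1 2>/dev/null || true   # bash builtin; no-op in shells without it
kill -0 "$pid" 2>/dev/null && echo "still running after disown"
```

After disown the job no longer receives the HUP signal when the SSH session closes, so the extraction keeps going.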
2. Linux also has a console utility called screen. It creates its own virtual session, independent of the current SSH connection, and lets you open multiple virtual sessions and switch between them like windows. Moreover, if the SSH session drops, all your processes keep running inside screen. On the next SSH connection you can reattach to any previously opened session, check how things are going, interrupt processes, etc. The main thing is not to forget about your open screen sessions and to close them with the command 'exit' when they are no longer needed. If screen is not installed on your OS, it is available in the standard package or ports collection of every distribution. The utility is described in detail here: http://help.ubuntu.ru/wiki/screen
Choosing hosting
----------------
A dedicated server or a virtual private server (VPS) is recommended for this database, as it loads the host heavily during data loads and updates, and also requires additional MySQL configuration.
The most crucial is the amount of RAM.
Recommended: 2 GB RAM, 1.5 GB of which is allocated to MySQL. Minimum required: 2 GB RAM, 700 MB for MySQL - it will run, but loading/updating the data will take a long time.
You may also use the database remotely from my MySQL server - in this case the process of initial upload and further updates is simplified, and the requirements for your web hosting are reduced: in most cases cheap shared 1-5 GB hosting will be enough for the site.
Sample SQL queries
------------------
If it is unclear how one of the sample SQL queries works, put the word EXPLAIN in front of it, e.g.:
EXPLAIN SELECT * FROM LANGUAGES;
This often helps to understand the structure of a complex query: how the tables are related to each other and exactly how MySQL processes the query. The EXPLAIN output usually also shows which additional indexes could speed up a query that runs slowly.
You can find samples of 22 popular queries against the resulting MySQL database, with detailed comments, in the 'queries' folder:
ART_LOOKUP.sql - Product search (across all manufacturers) by a specified original/non-original commercial number (article).
It is used to determine the manufacturer for a subsequent analog search by the combination Article+Manufacturer.
ART_LOOKUP2.sql - Analog search (original/non-original/commercial) by any specified article (original/ non-original/commercial) ARTICLES.sql - Displays detailed information for a given non-original product (ART_ID):
product overview; criteria (features); additional product information; analog articles (original/non-original/commercial); file paths for pictures; file paths for vendor logos; PDF file paths; applicability to vehicles; recommended prices
ARTICLES_SEARCH.sql - Search for non-original products by category with the specified text in the name, suitable for a specified vehicle type (TYP_ID)
COUNTRIES.sql - Displays a list of all TecDoc countries and groups of countries
MANUFACTURERS.sql - Displays all car brands
MODELS.sql - Displays a list of models for a specified make of car (MFA_ID)
SEARCH_TREE.sql - Builds the product category tree for a specified vehicle type (TYP_ID)
SEARCH_TREE2.sql - Displays a list of non-original products for a specified vehicle type (TYP_ID) and category (STR_ID)
SEARCH_TREE3.sql - Displays a list of non-original products for a specified category (STR_ID)
SEARCH_TREE4.sql - Displays the full category tree for all spare parts
SEARCH_TREE4.xlsx - A table with the result of the same name query for the current version of TecDoc
SUPPLIER_COUNTRIES.sql - Displays the country of the specified non-original manufacturer (SUP_ID)
TYPE_NUMBERS.sql - Displays a list of vehicle types for a specified KBA number (used in Germany)
TYPES.sql - Displays a list of vehicle types for a specified model (MOD_ID)
Some of the samples contain a LIMIT 100 clause so that only the first 100 matches are displayed. If you want to see all matches, remove this restriction. These samples are most often used in a chain, feeding the values obtained at each step into the input of the next one:
1. Select via the tree of cars and parts categories:
MANUFACTURERS.sql -> MODELS.sql -> TYPES.sql -> SEARCH_TREE.sql (several times) -> SEARCH_TREE2.sql -> ARTICLES.sql
2. Direct article search (including search by the analog number):
ART_LOOKUP.sql -> ART_LOOKUP2.sql -> ARTICLES.sql
FUNCTIONS.sql - Additional MySQL-functions and examples of their use
KEYS.sql - Adds supplementary indexes to the tables to speed up the sample queries. Only needed if you initially loaded the tables with the minimum set of indexes.
- Prices and packages for the Irbis software based on TecDoc, from the Mstarproject company: http://www.mstarproject.com/en/?action=tecdoc_mysql_site
- All text information in one archive: http://manual.mstarproject.com/images/en_tecdoc_autobase_original.zip