This page describes the ins and outs of the ImportXmlFile script.
Goal: migrate networks between different Issuecrawler versions. Once the XML file has been imported, the network can be found in the archive under the date on which the crawl started.
How to use
- Both scripts can only be run by administrators.
How does the script work?
- First, all information about the network is read and inserted into the database (e.g. title, date, author, starting_points, iterations, depth; basically everything that belongs in im_network).
- Second, all information about the internal pages and sites is read: first all sites are inserted, then the pages belonging to each site.
- Third, once all internal sites and pages have been inserted, the link information is retrieved and inserted into im_link.
- Fourth, the second and third steps are repeated for the external sites and pages.
- Fifth, the knowledge for each site is calculated and updated accordingly.
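The ordering of these steps can be illustrated with a minimal Python sketch. It is not the script itself: the dictionary layout of the parsed network, the operation labels "site", "page" and "update_knowledge", and the function name plan_import are assumptions made for illustration. Only im_network and im_link are table names taken from this page, and the knowledge calculation is represented by a placeholder operation because its formula is not described here.

```python
def plan_import(network):
    """Return the insert operations in the order described above.

    `network` is a hypothetical, already parsed representation of the XML file.
    """
    ops = [("im_network", network["title"])]          # step 1: network record
    for scope in ("internal", "external"):            # steps 2-3, then step 4
        sites = network[scope]
        for site in sites:                            # all sites first
            ops.append(("site", site["host"]))
        for site in sites:                            # then the pages of each site
            for page in site["pages"]:
                ops.append(("page", page["url"]))
        for site in sites:                            # links only after sites and pages
            for page in site["pages"]:
                for target in page["links"]:
                    ops.append(("im_link", (page["url"], target)))
    for scope in ("internal", "external"):            # step 5: placeholder per site
        for site in network[scope]:
            ops.append(("update_knowledge", site["host"]))
    return ops


# Made-up sample data: one internal site linking to one external site.
sample = {
    "title": "demo",
    "internal": [{"host": "a.org", "pages": [
        {"url": "http://a.org/1", "links": ["http://b.org/1"]}]}],
    "external": [{"host": "b.org", "pages": [
        {"url": "http://b.org/1", "links": []}]}],
}
ops = plan_import(sample)
```

Inserting sites before pages and pages before links means every link row can refer to records that already exist.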
How do things get inserted?
The output file is imported into the database with the following settings:
- schedule_index = 0
- Schedule information is lost because it cannot be retrieved from the XML file. Any schedules for this crawl have to be recreated after the XML file has been imported into the database.
- output_file_dirty = y
- crawl_status = 2 (completed)
- status = 1 (network_status = private)
- The crawl_queued parameter is set to the same value as crawl_start, because only the start and end dates can be retrieved from the XML file.
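The fixed settings above can be summarised in a short Python sketch. The function name network_defaults and the crawl_end key are assumptions for illustration; the other keys and values are the ones listed on this page.

```python
from datetime import datetime


def network_defaults(crawl_start, crawl_end):
    """Build the fixed values applied to an imported network (illustrative only)."""
    return {
        "schedule_index": 0,          # schedule information is not in the XML file
        "output_file_dirty": "y",
        "crawl_status": 2,            # completed
        "status": 1,                  # private
        "crawl_start": crawl_start,
        "crawl_end": crawl_end,       # hypothetical key name for the end date
        "crawl_queued": crawl_start,  # no queue date in the XML, so reuse the start date
    }


start = datetime(2006, 3, 1, 12, 0)   # made-up example dates
end = datetime(2006, 3, 1, 18, 30)
defaults = network_defaults(start, end)
```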
The script looks for an author in the database with the email address and full name found in the XML file. If it finds such a user, the network is assigned to that user. If no user is found, the network will belong to Richard Rogers. Once the script has finished, an administrator can change the user manually.
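The owner lookup can be sketched as follows. This is an illustration under assumptions: the user records are plain dictionaries with hypothetical email and full_name keys, and the match is exact and case-sensitive, which this page does not specify.

```python
FALLBACK_OWNER = "Richard Rogers"  # default owner named on this page


def resolve_owner(users, email, full_name):
    """Return the full name of the matching user, or the fallback owner."""
    for user in users:
        # Both the email address and the full name must match.
        if user["email"] == email and user["full_name"] == full_name:
            return user["full_name"]
    return FALLBACK_OWNER


# Made-up user list for the example.
users = [{"email": "jane@example.org", "full_name": "Jane Doe"}]
found = resolve_owner(users, "jane@example.org", "Jane Doe")
missing = resolve_owner(users, "nobody@example.org", "No Body")
```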
For the rest, all parameters are read from the XML file.
What if it doesn't work?
- If you are returned to the page without any notification, the XML file is most likely too big. Ask the system administrators to increase upload_max_filesize in php.ini (currently set to 5 MB).
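Before uploading, the file size can be compared with the limit. The Python sketch below is an illustration, not part of the script: the function names are hypothetical, and the default limit of "5M" is the value mentioned above written in PHP's shorthand notation, where K, M and G stand for multiples of 1024 bytes.

```python
def parse_php_size(value):
    """Convert a PHP shorthand size such as '5M' into a number of bytes."""
    value = value.strip()
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    suffix = value[-1].upper()
    if suffix in units:
        return int(value[:-1]) * units[suffix]
    return int(value)  # plain number of bytes, no suffix


def exceeds_upload_limit(size_in_bytes, limit="5M"):
    """Return True when a file of this size is larger than the upload limit."""
    return size_in_bytes > parse_php_size(limit)


limit_bytes = parse_php_size("5M")               # 5 * 1024 * 1024
too_big = exceeds_upload_limit(6 * 1024 * 1024)  # a 6 MB file against a 5M limit
```

For a file on disk, os.path.getsize(path) gives the size in bytes to pass to exceeds_upload_limit.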