Friday 17 May 2013

How to extract and restore website files, databases manually from Plesk backup

I. FIRST WAY:

If the dump file is not too big (for example, 100-200 MB), you can unzip it and open it in any local email client. The parts of the dump will be shown as attachments. Choose and save the one you need, then unzip it.

II. SECOND WAY:

It can be done using the mpack tools, which work with MIME files. This package is included in Debian:

    apt-get install mpack

For other Linux systems you can try the RPM from ALT Linux: ftp://ftp.pbone.net/mirror/ftp.altlinux.ru/pub/distributions/ALTLinux/Sisyphus/files/i586/RPMS/mpack-1.6-alt1.i586.rpm

or compile mpack from the sources: http://ftp.andrew.cmu.edu/pub/mpack/.

Create an empty directory to extract the backup file:

     mkdir recover
     cd recover

and copy the backup into it. By default the Plesk backup is gzipped (if it is not, use cat instead), so run zcat to decompress it, then pass the data to munpack to extract the contents of the directories from the backup file:

    zcat DUMP_FILE.gz > DUMP_FILE
    cat DUMP_FILE | munpack

As a result you get a set of tar and SQL files that contain the domains' directories and databases. Untar the directory you need. For example, to restore the httpdocs folder for the DOMAIN.TLD domain:

    tar xvf DOMAIN.TLD.htdocs
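To avoid mixing files from different domains, each archive that munpack produced can be untarred into its own subdirectory. A minimal sketch (the function name is our own; the *.htdocs pattern follows the DOMAIN.TLD.htdocs naming above):

```shell
# Untar every per-domain archive into a matching subdirectory.
# The *.htdocs pattern matches names like DOMAIN.TLD.htdocs produced by munpack.
unpack_all_htdocs() {
    for archive in *.htdocs; do
        [ -e "$archive" ] || continue      # no matching archives: do nothing
        dir="${archive%.htdocs}"           # e.g. DOMAIN.TLD
        mkdir -p "$dir"
        tar xf "$archive" -C "$dir"
    done
}
```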

NOTE: the 'munpack' utility may not work with files larger than 2 GB, and during extraction you may receive an error like:

    cat DUMP_FILE | munpack
    DOMAIN.TLD.httpdocs (application/octet-stream)
    File size limit exceeded

In this case, try the third way below.

III. THIRD WAY:

First, check whether the dump is compressed, and unzip it if needed:

    file testdom.com_2006.11.13_11.27
    testdom.com_2006.11.13_11.27: gzip compressed data, from Unix

    zcat testdom.com_2006.11.13_11.27 > testdom.com_dump
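The check above can be scripted. A small sketch (the function name is our own) that decompresses the dump only when 'file' reports gzip data:

```shell
# Write a plain copy of the dump to $2, decompressing first if needed.
decompress_dump() {
    in="$1" out="$2"
    if file "$in" | grep -q 'gzip compressed'; then
        zcat "$in" > "$out"      # gzipped dump: decompress
    else
        cat "$in" > "$out"       # already plain: copy as-is
    fi
}
```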

The dump consists of an XML part that describes what is included in the dump, plus the data itself. Every piece of data can be located by its CID (Content ID), which can be found in the XML part.

For example, if the domain has hosting, all parts included in the hosting are listed like this:

<phosting cid_ftpstat="testdom.com.ftpstat" cid_webstat="testdom.com.webstat" cid_docroot="testdom.com.htdocs" cid_private="testdom.com.private"
cid_docroot_ssl="testdom.com.shtdocs" cid_webstat_ssl="testdom.com.webstat-ssl" cid_cgi="testdom.com.cgi" errdocs="true">

If you need to extract the domain's 'httpdocs', look for the value of the 'cid_docroot' parameter; in our case it is 'testdom.com.htdocs'.

Next, cut the content of 'httpdocs' from the whole dump using the CID you found. To do that, find the line number where the content begins and the line where it ends, like this:

    egrep -an '(^--_----------)|(testdom.com.htdocs)' ./testdom.com_dump | grep -A1 "Content-Type"
    2023:Content-Type: application/octet-stream; name="testdom.com.htdocs"
    3806:--_----------=_1163395694117660-----------------------------------------

Add 2 to the first line number and subtract 1 from the second, then run:

    head -n 3805 ./testdom.com_dump | tail -n +2025 > htdocs.tar

As a result you get a tar archive of the 'httpdocs' directory.
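The line arithmetic above can be automated. A hedged sketch (the extract_cid helper is our own invention, and it assumes the '--_----------' MIME boundary format shown in the examples):

```shell
# Cut one CID's content out of an unpacked Plesk dump.
# Usage: extract_cid DUMP_FILE CID OUTPUT_FILE
extract_cid() {
    dump="$1" cid="$2" out="$3"
    # Line number of the Content-Type header that names our CID
    start=$(grep -an "Content-Type:.*name=\"$cid\"" "$dump" | head -n 1 | cut -d: -f1)
    # First MIME boundary line after that header
    end=$(awk -v s="$start" 'NR > s && /^--_----------/ { print NR; exit }' "$dump")
    # Content starts 2 lines below the header (skipping the blank separator)
    # and ends 1 line above the closing boundary.
    head -n $((end - 1)) "$dump" | tail -n +$((start + 2)) > "$out"
}
```

For the httpdocs example above, `extract_cid ./testdom.com_dump testdom.com.htdocs htdocs.tar` reproduces the head/tail command without the manual counting.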

Restoring a database works similarly. Find the XML description of the database for the domain you need, for example:

<database version="4.1" name="mytest22" cid="mytest22.mysql.sql" type="mysql">
<db-server type="mysql">
<host>localhost</host>
<port>3306</port>
</db-server>
</database>

Find the database content by CID:

    egrep -an '(^--_----------)|(mytest22.mysql.sql)' ./testdom.com_dump | grep -A1 "Content-Type"
    1949:Content-Type: application/octet-stream; name="mytest22.mysql.sql"
    1975:--_----------=_1163395694117660-----------------------------------------

Add 2 to the first line number and subtract 1 from the second, then run:

    head -n 1974 ./testdom.com_dump | tail -n +1951 > mytest22.sql

As a result you get the database dump in SQL format.
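A sed one-liner can make the same cut. A small sketch (cut_range is our own wrapper), given the first content line (header line + 2) and the last content line (boundary line - 1):

```shell
# Print lines start..end of a file, inclusive -- equivalent to the
# head | tail pipe used above.
cut_range() {
    start="$1" end="$2" file="$3"
    sed -n "${start},${end}p" "$file"
}
```

For the database example above, `cut_range 1951 1974 ./testdom.com_dump > mytest22.sql` is equivalent to the head/tail command.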

Source: http://linuxadministrator.pro/blog/?p=436
