Difference between revisions of "PostgreSQL/From MySQL"
From Jon's Wiki
| (9 intermediate revisions by the same user not shown) | |||
| Line 1: | Line 1: | ||
| − | The following table shows how to get around some MySQL-specific non SQL | + | The following table shows how to get around some of the weird MySQL-specific, non-standard SQL syntax. |
| + | |||
| + | ;Note | ||
| + | :After a bunch of editing here, I found this [http://en.wikibooks.org/wiki/Programming:Converting_MySQL_to_PostgreSQL "Converting MySQL to PostgreSQL" wikibook]. | ||
{| border=1 | {| border=1 | ||
| Line 5: | Line 8: | ||
| '''MySQL''' || '''PostgreSQL''' || '''Supported by both?''' | | '''MySQL''' || '''PostgreSQL''' || '''Supported by both?''' | ||
|- | |- | ||
| − | | LIMIT | + | | LIMIT ''offset'', ''count'' || LIMIT ''count'' OFFSET ''offset'' || Yes |
| + | |- | ||
| + | | MATCH (''column(s)'') AGAINST ''query'' || ''ftsindex'' @@ ''query''::tsquery || No<sup>(1)</sup> | ||
| + | |- | ||
| + | | IFNULL(''expr1'', ''expr2'') || CASE WHEN ''expr1'' IS NULL THEN ''expr2'' ELSE ''expr1'' END || Yes | ||
| + | |- | ||
| + | | REPLACE INTO ''table'' (''pkcolumn'', ''column2'') <br /> VALUES ('key', 'value2') || INSERT INTO ''table'' (''pkcolumn'', ''column2'') <br /> SELECT 'key', 'value2' FROM ''table'' <br /> WHERE 'key' NOT IN (SELECT ''pkcolumn'' FROM ''table'') LIMIT 1 | ||
| + | || No | ||
|- | |- | ||
| − | | BOOLEAN <br /> INT <br /> INT(''n'') <br /> INTEGER(''n'') <br /> TINYINT <br /> TINYINT(''n'') <br /> TINYINTEGER <br /> TINYINTEGER(''n'') <br /> MEDIUMINT <br /> MEDIUMINT(''n'') <br /> MEDIUMINTEGER <br /> MEDIUMINTEGER(''n'') <br /> SMALLINT <br /> SMALLINT(''n'') <br /> SMALLINTEGER <br /> SMALLINTEGER(''n'') || INTEGER || Yes<sup>( | + | | BOOLEAN <br /> INT <br /> INT(''n'') <br /> INTEGER(''n'') <br /> TINYINT <br /> TINYINT(''n'') <br /> TINYINTEGER <br /> TINYINTEGER(''n'') <br /> MEDIUMINT <br /> MEDIUMINT(''n'') <br /> MEDIUMINTEGER <br /> MEDIUMINTEGER(''n'') <br /> SMALLINT <br /> SMALLINT(''n'') <br /> SMALLINTEGER <br /> SMALLINTEGER(''n'') || INTEGER || Yes<sup>(2)</sup> |
|- | |- | ||
| BOOLEAN || INTEGER CHECK (''column'' IN ('0', '1')) || No | | BOOLEAN || INTEGER CHECK (''column'' IN ('0', '1')) || No | ||
| Line 15: | Line 25: | ||
| INTEGER SIGNED || INTEGER || Yes | | INTEGER SIGNED || INTEGER || Yes | ||
|- | |- | ||
| − | | INTEGER AUTO_INCREMENT || SERIAL PRIMARY KEY || No<sup>( | + | | INTEGER AUTO_INCREMENT || SERIAL PRIMARY KEY || No<sup>(3)</sup> |
|- | |- | ||
| − | | VARCHAR(''n'') <br /> CHAR(''n'') <br /> LONGTEXT <br /> BLOB <br /> LONGBLOB || TEXT || Yes<sup>( | + | | VARCHAR(''n'') <br /> CHAR(''n'') <br /> LONGTEXT <br /> BLOB <br /> LONGBLOB || TEXT || Yes<sup>(4)</sup> |
|- | |- | ||
| DATETIME <br /> DATETIME NOT NULL DEFAULT '0000-00-00 00:00:00' || TIMESTAMP <br> TIMESTAMP NOT NULL DEFAULT NOW() || No | | DATETIME <br /> DATETIME NOT NULL DEFAULT '0000-00-00 00:00:00' || TIMESTAMP <br> TIMESTAMP NOT NULL DEFAULT NOW() || No | ||
| Line 23: | Line 33: | ||
| '0000-00-00 00:00:00' || NOW() || No | | '0000-00-00 00:00:00' || NOW() || No | ||
|- | |- | ||
| − | | CHARACTER SET UTF8 COLLATE UTF8_BIN || || No<sup>( | + | | CHARACTER SET UTF8 COLLATE UTF8_BIN || || No<sup>(5)</sup> |
|- | |- | ||
| − | | ''column'' SET('value1', 'value2') || ''column'' TEXT CHECK (''column'' IN ('value1', 'value2')) || No<sup>( | + | | ''column'' SET('value1', 'value2') || ''column'' TEXT CHECK (''column'' IN ('value1', 'value2')) || No<sup>(6)</sup> |
|- | |- | ||
| ''<columndef>'' COMMENT ''comment'' || COMMENT ON COLUMN ''column'' IS 'comment' || No | | ''<columndef>'' COMMENT ''comment'' || COMMENT ON COLUMN ''column'' IS 'comment' || No | ||
| Line 35: | Line 45: | ||
=== Notes === | === Notes === | ||
| + | # [[PostgreSQL full text searching]] is done with tsearch2, now integrated into PostgreSQL 8.3 or in contrib for earlier versions. | ||
# PostgreSQL has INT2 (SMALLINT), INT4 (INTEGER) and INT8 (BIGINT), but use INTEGER unless you want ones > 2<sup>31</sup> in which case use BIGINT. | # PostgreSQL has INT2 (SMALLINT), INT4 (INTEGER) and INT8 (BIGINT), but use INTEGER unless you want ones > 2<sup>31</sup> in which case use BIGINT. | ||
# The SERIAL type is an auto-incrementing INTEGER sequence (implicitly created at CREATE TABLE time), maintained externally from the table (akin to an index). Autoincrementing integers suck in distributed scenarios, consider a UUID or composite key instead. | # The SERIAL type is an auto-incrementing INTEGER sequence (implicitly created at CREATE TABLE time), maintained externally from the table (akin to an index). Autoincrementing integers suck in distributed scenarios, consider a UUID or composite key instead. | ||
| Line 47: | Line 58: | ||
Importing rows from a MySQL dump that uses INTEGER AUTO_INCREMENT for primary keys into a new PostgreSQL database using SERIAL primary keys is problematic. Options include: | Importing rows from a MySQL dump that uses INTEGER AUTO_INCREMENT for primary keys into a new PostgreSQL database using SERIAL primary keys is problematic. Options include: | ||
| − | # reset the PostgreSQL sequences to start from MAX(''idcolumn'') + 1. <br /> SELECT MAX(''idcolumn'') + 1 | + | # reset the PostgreSQL sequences to start from MAX(''idcolumn'') + 1. <br /> SELECT MAX(''idcolumn'') + 1 AS ''startvalue'' <br /> ALTER SEQUENCE ''sequence'' RESTART WITH ''startvalue'' |
# edit the dump to remove ''idcolumn'' values from the INSERT statements (highly tedious and probably not easily scriptable) | # edit the dump to remove ''idcolumn'' values from the INSERT statements (highly tedious and probably not easily scriptable) | ||
| + | |||
| + | == Script To Update Sequences == | ||
| + | |||
| + | A quick dirty script to update sequences to start from above the maximum value of the id field. Adjust to requirements. | ||
| + | |||
| + | <pre> | ||
| + | #!/usr/bin/python | ||
| + | import sys | ||
| + | import string | ||
| + | from pyPgSQL import PgSQL | ||
| + | |||
| + | def main(filename): | ||
| + | try: | ||
| + | conn = PgSQL.connect(database='dbname', user='username', host='localhost', password='password') | ||
| + | cur = conn.cursor() | ||
| + | except PgSQL.Error, msg: | ||
| + | print "*** PostgreSQL Connection Error: %s" % msg | ||
| + | return 1 | ||
| + | |||
| + | try: | ||
| + | f = open(filename, "r") | ||
| + | except: | ||
| + | print "*** File not found: %s" % filename | ||
| + | return 1 | ||
| + | |||
| + | for line in f: | ||
| + | if (line.find("CREATE TABLE ") != -1): | ||
| + | tablename = line.split()[2] | ||
| + | seqname = "" | ||
| + | elif (line.find(" SERIAL ") != -1): | ||
| + | columnname = line.split()[0] | ||
| + | seqname = "\"%s_%s_seq\"" % (tablename.strip('"'), columnname.strip('"')) | ||
| + | |||
| + | if seqname != "": | ||
| + | maxsql = "SELECT MAX(%s)+1 AS seqrestart FROM %s;" % (columnname, tablename) | ||
| + | cur.execute(maxsql) | ||
| + | res = cur.fetchall() | ||
| + | seqrestart = res[0].seqrestart | ||
| + | if seqrestart > 0: | ||
| + | seqsql = "ALTER SEQUENCE %s RESTART WITH %s;" % (seqname, seqrestart) | ||
| + | print seqsql | ||
| + | cur.execute(seqsql) | ||
| + | seqname = "" | ||
| + | |||
| + | cur.close() | ||
| + | conn.commit() | ||
| + | del cur, conn | ||
| + | return 0 | ||
| + | |||
| + | if __name__ == "__main__": | ||
| + | sys.exit(main("/path/to/your_schema.sql")) | ||
| + | </pre> | ||
Latest revision as of 02:25, 24 October 2014
The following table shows how to get around some of the weird MySQL-specific, non-standard SQL syntax.
- Note
- After a bunch of editing here, I found this "Converting MySQL to PostgreSQL" wikibook.
| MySQL | PostgreSQL | Supported by both? |
| LIMIT offset, count | LIMIT count OFFSET offset | Yes |
| MATCH (column(s)) AGAINST query | ftsindex @@ query::tsquery | No(1) |
| IFNULL(expr1, expr2) | CASE WHEN expr1 IS NULL THEN expr2 ELSE expr1 END | Yes |
| REPLACE INTO table (pkcolumn, column2) VALUES ('key', 'value2') |
INSERT INTO table (pkcolumn, column2) SELECT 'key', 'value2' FROM table WHERE 'key' NOT IN (SELECT pkcolumn FROM table) LIMIT 1 |
No |
| BOOLEAN INT INT(n) INTEGER(n) TINYINT TINYINT(n) TINYINTEGER TINYINTEGER(n) MEDIUMINT MEDIUMINT(n) MEDIUMINTEGER MEDIUMINTEGER(n) SMALLINT SMALLINT(n) SMALLINTEGER SMALLINTEGER(n) |
INTEGER | Yes(2) |
| BOOLEAN | INTEGER CHECK (column IN ('0', '1')) | No |
| INTEGER UNSIGNED | INTEGER CHECK (column > -1) | No |
| INTEGER SIGNED | INTEGER | Yes |
| INTEGER AUTO_INCREMENT | SERIAL PRIMARY KEY | No(3) |
| VARCHAR(n) CHAR(n) LONGTEXT BLOB LONGBLOB |
TEXT | Yes(4) |
| DATETIME DATETIME NOT NULL DEFAULT '0000-00-00 00:00:00' |
TIMESTAMP TIMESTAMP NOT NULL DEFAULT NOW() |
No |
| '0000-00-00 00:00:00' | NOW() | No |
| CHARACTER SET UTF8 COLLATE UTF8_BIN | No(5) | |
| column SET('value1', 'value2') | column TEXT CHECK (column IN ('value1', 'value2')) | No(6) |
| <columndef> COMMENT comment | COMMENT ON COLUMN column IS 'comment' | No |
| [UNIQUE] KEY indexname (column1, column2) | CREATE [UNIQUE] INDEX indexname ON tablename (column1, column2) | Yes |
Notes
- PostgreSQL full text searching is done with tsearch2, now integrated into PostgreSQL 8.3 or in contrib for earlier versions.
- PostgreSQL has INT2 (SMALLINT), INT4 (INTEGER) and INT8 (BIGINT), but use INTEGER unless you want ones > 231 in which case use BIGINT.
- The SERIAL type is an auto-incrementing INTEGER sequence (implicitly created at CREATE TABLE time), maintained externally from the table (akin to an index). Autoincrementing integers suck in distributed scenarios, consider a UUID or composite key instead.
- PostgreSQL couldn't care less, and stores them all as varying text anyway. Insisting on a length won't make it faster or use less disk space.
- Create your database with UTF8 encoding.
- As of 8.3 one can now go
CREATE TYPE type AS ENUM ('value1', 'value2')
column type
Useful magic MySQL incantation:
mysqldump fez_svn_empty -v -n --compatible=ansi,postgresql --complete-insert=TRUE --extended-insert=FALSE --compact --default-character-set=UTF8 -r fez_svn_empty.sql
For now, remove all KEY and FULLTEXT KEY declarations from the above sql file
Importing rows from a MySQL dump that uses INTEGER AUTO_INCREMENT for primary keys into a new PostgreSQL database using SERIAL primary keys is problematic. Options include:
- reset the PostgreSQL sequences to start from MAX(idcolumn) + 1.
SELECT MAX(idcolumn) + 1 AS startvalue
ALTER SEQUENCE sequence RESTART WITH startvalue - edit the dump to remove idcolumn values from the INSERT statements (highly tedious and probably not easily scriptable)
Script To Update Sequences
A quick dirty script to update sequences to start from above the maximum value of the id field. Adjust to requirements.
#!/usr/bin/python
import sys
import string
from pyPgSQL import PgSQL
def main(filename):
try:
conn = PgSQL.connect(database='dbname', user='username', host='localhost', password='password')
cur = conn.cursor()
except PgSQL.Error, msg:
print "*** PostgreSQL Connection Error: %s" % msg
return 1
try:
f = open(filename, "r")
except:
print "*** File not found: %s" % filename
return 1
for line in f:
if (line.find("CREATE TABLE ") != -1):
tablename = line.split()[2]
seqname = ""
elif (line.find(" SERIAL ") != -1):
columnname = line.split()[0]
seqname = "\"%s_%s_seq\"" % (tablename.strip('"'), columnname.strip('"'))
if seqname != "":
maxsql = "SELECT MAX(%s)+1 AS seqrestart FROM %s;" % (columnname, tablename)
cur.execute(maxsql)
res = cur.fetchall()
seqrestart = res[0].seqrestart
if seqrestart > 0:
seqsql = "ALTER SEQUENCE %s RESTART WITH %s;" % (seqname, seqrestart)
print seqsql
cur.execute(seqsql)
seqname = ""
cur.close()
conn.commit()
del cur, conn
return 0
if __name__ == "__main__":
sys.exit(main("/path/to/your_schema.sql"))