[Bug] CALL sys.expire_partitions failed when using hive metastore and setting 'metastore.partitioned-table' = 'false'. #4873
Comments
Hi @JingFengWang, can you provide the table creation statement? And is your table internal or external?
Hi @yangjf2019
Hi @yangjf2019, here is the solution I propose:
--- a/paimon-hive/paimon-hive-catalog/src/main/java/org/apache/paimon/hive/HiveMetastoreClient.java
+++ b/paimon-hive/paimon-hive-catalog/src/main/java/org/apache/paimon/hive/HiveMetastoreClient.java
@@ -29,13 +29,10 @@ import org.apache.paimon.utils.PartitionPathUtils;
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.IMetaStoreClient;
-import org.apache.hadoop.hive.metastore.api.AlreadyExistsException;
-import org.apache.hadoop.hive.metastore.api.NoSuchObjectException;
-import org.apache.hadoop.hive.metastore.api.Partition;
-import org.apache.hadoop.hive.metastore.api.PartitionEventType;
-import org.apache.hadoop.hive.metastore.api.StorageDescriptor;
-import org.apache.hadoop.hive.metastore.api.Table;
+import org.apache.hadoop.hive.metastore.api.*;
import org.apache.thrift.TException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
import java.util.ArrayList;
import java.util.HashMap;
@@ -54,6 +51,8 @@ public class HiveMetastoreClient implements MetastoreClient {
private static final String HIVE_LAST_UPDATE_TIME_PROP = "transient_lastDdlTime";
+ private static final Logger LOG = LoggerFactory.getLogger(HiveMetastoreClient.class);
+
private final Identifier identifier;
private final ClientPool<IMetaStoreClient, TException> clients;
@@ -154,6 +153,12 @@ public class HiveMetastoreClient implements MetastoreClient {
false));
} catch (NoSuchObjectException e) {
// do nothing if the partition not exists
+ } catch (MetaException e) {
+ // When using hive metastore with 'metastore.partitioned-table' = 'false',
+ // there is no partition storage information in the hive metastore,
+ // so this exception is expected and can be safely ignored.
+ } catch (TException e) {
+ // Log and swallow other thrift errors so partition expiration can continue.
+ LOG.warn("Failed to drop partition in hive metastore", e);
}
}
Hi @JingFengWang, I tried to reproduce your problem in this environment: Spark 3.2, Paimon 0.9.0, Hive 3.1.3, but not using the spark-sql client. The CALL command works for me:
import org.apache.spark.sql.SparkSession

object PaimonHiveCatalogExpireApp {
  def main(args: Array[String]): Unit = {
val catalogName = "your_paimon_catalog_name"
val paimonDatabase = "db"
val table = "tb"
val thriftServer = "thrift://localhost:9083"
val warehouse = "hdfs://hadoop.single.node:9000/user/hive/warehouse"
val spark =
SparkSession
.builder()
.appName(PaimonHiveCatalogExpireApp.getClass.getSimpleName)
.config(s"spark.sql.catalog.$catalogName", "org.apache.paimon.spark.SparkCatalog")
.config(s"spark.sql.catalog.$catalogName.metastore", "hive")
.config(s"spark.sql.catalog.$catalogName.uri", thriftServer)
.config(s"spark.sql.catalog.$catalogName.warehouse", warehouse)
.config("spark.sql.extensions", "org.apache.paimon.spark.extensions.PaimonSparkSessionExtensions")
.master("local")
.getOrCreate()
spark.sql(
s"""
|use $catalogName
|""".stripMargin)
spark.sql(
s"""
|create database $paimonDatabase
|""".stripMargin)
spark.sql(
s"""
|use $paimonDatabase
|""".stripMargin)
spark.sql(
s"""
|CREATE TABLE $paimonDatabase.$table (
| dt BIGINT COMMENT 'timestamp in milliseconds',
| randomnum INT COMMENT 'random number',
| version STRING COMMENT 'xx',
| day STRING COMMENT 'day',
| hour STRING COMMENT 'hour'
|) USING paimon
|PARTITIONED BY (day, hour)
|TBLPROPERTIES (
| 'write-only' = 'true',
| 'write-buffer-spillable' = 'true',
| 'write-buffer-for-append' = 'true',
| 'file.format' = 'orc',
| 'file.compression' = 'zstd',
| 'target-file-size' = '536870912',
| 'bucket' = '80',
| 'bucket-key' = 'randomnum'
|)
|
|""".stripMargin)
spark.sql(
s"""
|CALL sys.expire_partitions(
| table => '$paimonDatabase.$table',
| expiration_time => '1 d',
| timestamp_formatter => 'yyyy-MM-dd'
| )
|
|""".stripMargin)
  }
}
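One note on this attempt: the TBLPROPERTIES above do not include 'metastore.partitioned-table' = 'false' from step 1 of the report, so this run may not exercise the failing code path. A hedged sketch of setting it on the existing table (assuming the db/tb names above; if the option is only honored at table creation, it belongs in the CREATE TABLE's TBLPROPERTIES instead):
-- assumed follow-up statement, not part of the original snippet
ALTER TABLE db.tb SET TBLPROPERTIES ('metastore.partitioned-table' = 'false');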
Hi @yangjf2019 The problem occurs in Paimon 1.0, but 0.9 has it as well; I tested 0.9.
Well, I will try writing some data into the expiring partitions and take another look tomorrow.
Late reply! I ran into the following problems.
I was not able to reproduce your problem.
Search before asking
Paimon version
release-1.0
Compute Engine
spark-3.2.0
Minimal reproduce step
Step 1: Create a date-partitioned table with 'metastore.partitioned-table' = 'false' (a SQL sketch of all three steps follows below).
Step 2: Write test data for the last N days.
Step 3: CALL sys.expire_partitions(table => 'db.tb', expiration_time => '1 d', timestamp_formatter => 'yyyy-MM-dd');
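A minimal SQL sketch of these steps (the db.tb name and columns are borrowed from the reproduction attempt above; the sample row values are illustrative assumptions, not from the original report):
-- Step 1: date-partitioned table whose partitions are not synced to the Hive metastore
CREATE TABLE db.tb (
  dt BIGINT,
  randomnum INT,
  day STRING,
  hour STRING
) USING paimon
PARTITIONED BY (day, hour)
TBLPROPERTIES (
  'metastore.partitioned-table' = 'false'
);
-- Step 2: write test data for the last N days (a single older row shown here)
INSERT INTO db.tb VALUES (1735743600000, 42, '2025-01-01', '15');
-- Step 3: expire partitions older than one day
CALL sys.expire_partitions(
  table => 'db.tb',
  expiration_time => '1 d',
  timestamp_formatter => 'yyyy-MM-dd'
);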
What doesn't meet your expectations?
An exception is thrown when sys.expire_partitions is executed, and no snapshot is generated, so the partitions that should expire are never physically deleted after snapshot expiration.
Anything else?
Exception information:
spark-sql> CALL sys.expire_partitions(table => 'db.tb', expiration_time => '1 d', timestamp_formatter => 'yyyy-MM-dd');
25/01/08 17:50:18 ERROR SparkSQLDriver: Failed in [CALL sys.expire_partitions(table => 'db.tb', expiration_time => '1 d', timestamp_formatter => 'yyyy-MM-dd')]
java.lang.RuntimeException: MetaException(message:Invalid partition key & values; keys [], values [2025-01-01, 15, ])
at org.apache.paimon.operation.PartitionExpire.deleteMetastorePartitions(PartitionExpire.java:175)
at org.apache.paimon.operation.PartitionExpire.doExpire(PartitionExpire.java:162)
at org.apache.paimon.operation.PartitionExpire.expire(PartitionExpire.java:139)
at org.apache.paimon.operation.PartitionExpire.expire(PartitionExpire.java:109)
at org.apache.paimon.spark.procedure.ExpirePartitionsProcedure.lambda$call$2(ExpirePartitionsProcedure.java:115)
at org.apache.paimon.spark.procedure.BaseProcedure.execute(BaseProcedure.java:88)
at org.apache.paimon.spark.procedure.BaseProcedure.modifyPaimonTable(BaseProcedure.java:78)
at org.apache.paimon.spark.procedure.ExpirePartitionsProcedure.call(ExpirePartitionsProcedure.java:87)
at org.apache.paimon.spark.execution.PaimonCallExec.run(PaimonCallExec.scala:32)
at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result$lzycompute(V2CommandExec.scala:43)
Are you willing to submit a PR?