
[Bug] An exception occurs when a Flink task writes to a Paimon secondary partition #4949

Labels
bug Something isn't working

Comments

GangYang-HX commented Jan 20, 2025

Search before asking

  • I searched in the issues and found nothing similar.

Paimon version

paimon-0.9.0
paimon-1.0.0

Compute Engine

Flink-1.18.1

Minimal reproduce step

Paimon table creation script:

```sql
CREATE TABLE paimon_test.user_behavior_fusion_di_test_13 (
  seq int COMMENT 'from deserializer',
  meta_id int COMMENT 'from deserializer',
  service_id string COMMENT 'from deserializer',
  client_timestamp bigint COMMENT 'from deserializer',
  server_timestamp bigint COMMENT 'from deserializer',
  props map<string,string> COMMENT 'from deserializer',
  channel_name string COMMENT 'from deserializer',
  os_name string COMMENT 'from deserializer',
  version_no string COMMENT 'from deserializer',
  session_id string COMMENT 'from deserializer',
  device_id string COMMENT 'from deserializer',
  device_name string COMMENT 'from deserializer',
  device_type_name string COMMENT 'from deserializer',
  user_id bigint COMMENT 'from deserializer',
  app_id int COMMENT 'from deserializer',
  bid int COMMENT 'from deserializer',
  extra_attribute map<string,string> COMMENT 'from deserializer',
  client_send_timestamp bigint COMMENT 'from deserializer',
  insert_timestamp timestamp COMMENT 'from deserializer',
  uncommon_map map<string,string> COMMENT 'from deserializer',
  random_column int COMMENT 'from deserializer',
  data_version int COMMENT 'from deserializer',
  location_code bigint COMMENT 'from deserializer',
  duration_time bigint COMMENT 'from deserializer',
  device_id_code bigint COMMENT 'from deserializer',
  is_lock_exposed boolean COMMENT 'from deserializer',
  from_back int COMMENT 'from deserializer')
PARTITIONED BY (
  dt string COMMENT 'date, yyyyMMdd',
  event_id string COMMENT 'event type')
ROW FORMAT SERDE
  'org.apache.paimon.hive.PaimonSerDe'
STORED BY
  'org.apache.paimon.hive.PaimonStorageHandler'
WITH SERDEPROPERTIES (
  'serialization.format'='1')
TBLPROPERTIES (
  'bucket'='-1',
  'bucketing_version'='2',
  'file-index.bitmap.columns'='app_id,meta_id,bid',
  'file.format'='parquet',
  'num-sorted-run.stop-trigger'='30',
  'parquet.compression'='zstd',
  'partition.expiration-check-interval'='1d',
  'partition.expiration-time'='365d',
  'partition.timestamp-formatter'='yyyyMMdd',
  'partition.timestamp-pattern'='$dt',
  'snapshot.expire.limit'='8',
  'snapshot.num-retained.min'='16',
  'snapshot.time-retained'='6h',
  'sort-spill-threshold'='10',
  'target-file-size'='1GB',
  'transient_lastDdlTime'='1737100343',
  'write-buffer-size'='512MB',
  'write-manifest-cache'='1GB')
```

The task generally succeeds at the first checkpoint. In practice, the write succeeds when no new partition information is involved; once it is, the exception below is reported.
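For context, the write side is a streaming insert into the two-level (dt, event_id) partition. A minimal sketch of such a job, assuming a hypothetical source table `log_source` whose columns match the sink (only the sink table and its partition keys come from the DDL above):

```sql
-- Hedged sketch: `log_source` and its schema are hypothetical; the sink table
-- and the (dt, event_id) partition columns are taken from the DDL above.
INSERT INTO paimon_test.user_behavior_fusion_di_test_13
SELECT
  seq, meta_id, service_id, client_timestamp, server_timestamp, props,
  channel_name, os_name, version_no, session_id, device_id, device_name,
  device_type_name, user_id, app_id, bid, extra_attribute,
  client_send_timestamp, insert_timestamp, uncommon_map, random_column,
  data_version, location_code, duration_time, device_id_code,
  is_lock_exposed, from_back,
  -- partition columns: dt derived from the event's server timestamp, plus event_id
  DATE_FORMAT(TO_TIMESTAMP_LTZ(server_timestamp, 3), 'yyyyMMdd') AS dt,
  event_id
FROM log_source;
```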

What doesn't meet your expectations?

```
2025-01-20 11:12:02
java.lang.RuntimeException: MetaException(message:Exception thrown when executing query :
    SELECT DISTINCT 'org.apache.hadoop.hive.metastore.model.MPartition' AS NUCLEUS_TYPE,
           A0.CREATE_TIME, A0.LAST_ACCESS_TIME, A0.PART_NAME, A0.PART_ID
    FROM PARTITIONS A0
    LEFT OUTER JOIN TBLS B0 ON A0.TBL_ID = B0.TBL_ID
    LEFT OUTER JOIN DBS C0 ON B0.DB_ID = C0.DB_ID
    WHERE B0.TBL_NAME = ? AND C0.NAME = ? AND A0.PART_NAME = ? AND C0.CTLG_NAME = ?)
	at org.apache.paimon.metastore.AddPartitionCommitCallback.addPartitions(AddPartitionCommitCallback.java:95)
	at org.apache.paimon.metastore.AddPartitionCommitCallback.retry(AddPartitionCommitCallback.java:76)
	at org.apache.paimon.operation.FileStoreCommitImpl.lambda$filterCommitted$0(FileStoreCommitImpl.java:247)
	at java.util.ArrayList.forEach(ArrayList.java:1259)
	at org.apache.paimon.operation.FileStoreCommitImpl.filterCommitted(FileStoreCommitImpl.java:247)
	at org.apache.paimon.table.sink.TableCommitImpl.filterAndCommitMultiple(TableCommitImpl.java:244)
	at org.apache.paimon.flink.sink.StoreCommitter.filterAndCommit(StoreCommitter.java:119)
	at org.apache.paimon.flink.sink.Committer.filterAndCommit(Committer.java:60)
	at org.apache.paimon.flink.sink.RestoreAndFailCommittableStateManager.recover(RestoreAndFailCommittableStateManager.java:82)
	at org.apache.paimon.flink.sink.RestoreAndFailCommittableStateManager.initializeState(RestoreAndFailCommittableStateManager.java:77)
	at org.apache.paimon.flink.sink.CommitterOperator.initializeState(CommitterOperator.java:142)
	at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.initializeOperatorState(StreamOperatorStateHandler.java:122)
	at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:274)
	at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:106)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:753)
	at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:728)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:693)
	at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:953)
	at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:922)
	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:746)
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
	at java.lang.Thread.run(Thread.java:750)
Caused by: MetaException(message:Exception thrown when executing query :
    SELECT DISTINCT 'org.apache.hadoop.hive.metastore.model.MPartition' AS NUCLEUS_TYPE,
           A0.CREATE_TIME, A0.LAST_ACCESS_TIME, A0.PART_NAME, A0.PART_ID
    FROM PARTITIONS A0
    LEFT OUTER JOIN TBLS B0 ON A0.TBL_ID = B0.TBL_ID
    LEFT OUTER JOIN DBS C0 ON B0.DB_ID = C0.DB_ID
    WHERE B0.TBL_NAME = ? AND C0.NAME = ? AND A0.PART_NAME = ? AND C0.CTLG_NAME = ?)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$add_partitions_req_result$add_partitions_req_resultStandardScheme.read(ThriftHiveMetastore.java)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$add_partitions_req_result$add_partitions_req_resultStandardScheme.read(ThriftHiveMetastore.java)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$add_partitions_req_result.read(ThriftHiveMetastore.java)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_add_partitions_req(ThriftHiveMetastore.java:2488)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.add_partitions_req(ThriftHiveMetastore.java:2475)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.add_partitions(HiveMetaStoreClient.java:695)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212)
	at com.sun.proxy.$Proxy27.add_partitions(Unknown Source)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2773)
	at com.sun.proxy.$Proxy28.add_partitions(Unknown Source)
	at org.apache.paimon.hive.HiveMetastoreClient.lambda$addPartitions$3(HiveMetastoreClient.java:107)
	at org.apache.paimon.client.ClientPool$ClientPoolImpl.lambda$execute$0(ClientPool.java:80)
	at org.apache.paimon.client.ClientPool$ClientPoolImpl.run(ClientPool.java:68)
	at org.apache.paimon.client.ClientPool$ClientPoolImpl.execute(ClientPool.java:77)
	at org.apache.paimon.hive.pool.CachedClientPool.execute(CachedClientPool.java:139)
	at org.apache.paimon.hive.HiveMetastoreClient.addPartitions(HiveMetastoreClient.java:107)
	at org.apache.paimon.metastore.AddPartitionCommitCallback.addPartitions(AddPartitionCommitCallback.java:88)
	... 22 more
```

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
GangYang-HX added the bug label on Jan 20, 2025
GangYang-HX (Contributor, Author) commented Jan 20, 2025

[screenshot]

[screenshot]

The data is written successfully once.

GangYang-HX (Contributor, Author) commented:

Paimon Catalog:

[screenshot]
