Skip to content

Conversation

vikramahuja1001
Copy link
Contributor

  1. Expose a DDL to set default partition name at table level
  2. Restrict set hive.exec.default.partition at runtime
  3. use table property to check the default partition along with hive.exec.default.partition at cluster level

For more details refer: https://issues.apache.org/jira/browse/HIVE-29116

What changes were proposed in this pull request?

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

@vikramahuja1001
Copy link
Contributor Author

@deniskuzZ , @chinnaraolalam could you please take a look at this PR?

Copy link
Contributor

@ngsg ngsg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @vikramahuja1001, I have the same concern about setting the default partition name, and I'm glad to see the patch addressing it. I have reviewed the changes, except for HiveAlterHandler.java (I need more time to study the changes and the relevant context), and left comments on the places where I have questions. Could you please review my comments on the changes?


set hive.msck.path.validation=skip;

MSCK REPAIR TABLE tbl_y;

SHOW PARTITIONS tbl_y;

SET hive.exec.default.partition.name=SECOND_PARTITION;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test to validate hive.exec.default.partition.name is no more session configurable

@vikramahuja1001
Copy link
Contributor Author

vikramahuja1001 commented Aug 13, 2025

Thanks for reviewing this PR @ngsg @Indhumathi27
I will handle the review comments, push the changes soon.

@Aggarwal-Raghav
Copy link
Contributor

@vikramahuja1001 , i think change is required in org.apache.hadoop.hive.ql.exec.DDLPlanUtils#checkIfDefaultPartition()

if (pt.contains(HIVE_DEFAULT_PARTITION)) {

@Aggarwal-Raghav
Copy link
Contributor

Aggarwal-Raghav commented Aug 14, 2025

Remove hardcoding from here as well + update comments as well accordingly
org.apache.hadoop.hive.common.FileUtils#escapePathName(java.lang.String, java.lang.String)

return "__HIVE_DEFAULT_PARTITION__";

@Aggarwal-Raghav
Copy link
Contributor

Remove hardcoding from following place:

dynPathSpec = dynPathSpec.replace("__HIVE_DEFAULT_PARTITION__", "*");

Copy link

@vikramahuja1001
Copy link
Contributor Author

@ngsg , @Aggarwal-Raghav , @Indhumathi27 i have reworked all the comments and pushed all the changes. In the build pipeline all the tests are passing but there seems to be javadoc issue with the thrift gen code, no idea what's causing that. Looking at that atm.

Could you guys please re-review it again?

@chinnaraolalam and @deniskuzZ , could you also please take a look at this PR and maybe review it?

@ngsg
Copy link
Contributor

ngsg commented Aug 28, 2025

Thanks a lot for the hard work on this PR. I haven't fully reviewed all the changes yet (the diff is quite large), but let me leave a few comments for now:

  • In some cases, handling default partition name isn't necessary. I would suggest introducing new methods with Preconditions for those cases to avoid passing {tableParam, conf} or a default partition name. For example, when calling Warehouse#escapePathName in Warehouse#makePartNameUtil: e.getKey() doesn't require a default partition name, and e.getValue() is guaranteed not to be empty or null.

  • Extending the above point, I would suggest passing just a default partition name instead of both tableParam and configuration whenever possible. For example, since FileUtils#escapePathNames already takes defaultPath, it would be better to require callers to always pass a proper default partition name. This way, we propagate only the minimal required information and lower the risk of human error when dealing with a potentially mutable map instance.

  • Regarding InsertHiveLocksCommand, I would suggest considering alternatives. Do we really need to ship the entire content of tableParams over LockComponent? Could we instead just send the default partition name, or ensure that the partition name is not null or empty? (We might even assume there is no empty string or null based on the implementation of com.google.common.base.Splitter$MapSplitter#split, though this code is outside our control.)

  • I haven't fully thought this through, but since the changes now affect the Thrift interface, I would like to hear others' opinions. What do you think about extending the definition of Table and Partition in the Thrift interface instead of using tableParams? In this way, we might be able to simplify default partition determining steps.

Also, regarding the error in Javadoc, it seems the issue comes from the javadoc comment in LockComponentBuilder#setTableParams: the parameter name should be tableParams, not tableName.

@vikramahuja1001
Copy link
Contributor Author

@ngsg , thanks for spending time to review this PR and providing your feedback. I will rework as per your above comment and push the changes in a few days. Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants