path: root/gcc/tree-vectorizer.h
Age | Commit message | Author
2020-05-08 | move permutation validity check | Richard Biener
This delays the SLP permutation check to vectorizable_load and optimizes permutations only after all SLP instances have been generated and the vectorization factor is determined. 2020-05-08 Richard Biener <rguenther@suse.de> * tree-vectorizer.h (vec_info::slp_loads): New. (vect_optimize_slp): Declare. * tree-vect-slp.c (vect_attempt_slp_rearrange_stmts): Do nothing when there are no loads. (vect_gather_slp_loads): Gather loads into a vector. (vect_supported_load_permutation_p): Remove. (vect_analyze_slp_instance): Do not verify permutation validity here. (vect_analyze_slp): Optimize permutations of reductions after all SLP instances have been gathered and gather all loads. (vect_optimize_slp): New function split out from vect_supported_load_permutation_p. Elide some permutations. (vect_slp_analyze_bb_1): Call vect_optimize_slp. * tree-vect-loop.c (vect_analyze_loop_2): Likewise. * tree-vect-stmts.c (vectorizable_load): Check whether the load can be permuted. When generating code assert we can. * gcc.dg/vect/bb-slp-pr68892.c: Adjust for not supported SLP permutations becoming builds from scalars. * gcc.dg/vect/bb-slp-pr78205.c: Likewise. * gcc.dg/vect/bb-slp-34.c: Likewise.
2020-05-06 | Prepare removal of SLP_INSTANCE_GROUP_SIZE | Richard Biener
This removes trivial instances of SLP_INSTANCE_GROUP_SIZE and refrains from using a "SLP instance" which nowadays is just one of the possibly many entries into the SLP graph. 2020-05-06 Richard Biener <rguenther@suse.de> * tree-vectorizer.h (vect_transform_slp_perm_load): Adjust. * tree-vect-data-refs.c (vect_slp_analyze_node_dependences): Remove slp_instance parameter, just iterate over all scalar stmts. (vect_slp_analyze_instance_dependence): Adjust and likewise. * tree-vect-slp.c (vect_bb_slp_scalar_cost): Remove unused BB parameter. (vect_schedule_slp): Just iterate over all scalar stmts. (vect_supported_load_permutation_p): Adjust. (vect_transform_slp_perm_load): Remove slp_instance parameter, instead use the number of lanes in the node as group size. * tree-vect-stmts.c (vect_model_load_cost): Get vectorization factor instead of slp_instance as parameter. (vectorizable_load): Adjust.
2020-05-05 | add vec_info * parameters where needed | Richard Biener
Soonish we'll get SLP nodes which have no corresponding scalar stmt, and thus no stmt_vec_info and no way to get back to the associated vec_info. This patch makes the vec_info available as part of the APIs instead of putting that back-pointer into the leaf data structures.

2020-05-05 Richard Biener <rguenther@suse.de>

* tree-vectorizer.h (_stmt_vec_info::vinfo): Remove.
  (STMT_VINFO_LOOP_VINFO): Likewise.
  (STMT_VINFO_BB_VINFO): Likewise.
* tree-vect-data-refs.c: Adjust for the above, adding vec_info * parameters and adjusting calls.
* tree-vect-loop-manip.c: Likewise.
* tree-vect-loop.c: Likewise.
* tree-vect-patterns.c: Likewise.
* tree-vect-slp.c: Likewise.
* tree-vect-stmts.c: Likewise.
* tree-vectorizer.c: Likewise.
* target.def (add_stmt_cost): Add vec_info * parameter.
* target.h (stmt_in_inner_loop_p): Likewise.
* targhooks.c (default_add_stmt_cost): Adjust.
* doc/tm.texi: Re-generate.
* config/aarch64/aarch64.c (aarch64_extending_load_p): Add vec_info * parameter and adjust.
  (aarch64_sve_adjust_stmt_cost): Likewise.
  (aarch64_add_stmt_cost): Likewise.
* config/arm/arm.c (arm_add_stmt_cost): Likewise.
* config/i386/i386.c (ix86_add_stmt_cost): Likewise.
* config/rs6000/rs6000.c (rs6000_add_stmt_cost): Likewise.
2020-01-20 | tree-optimization/93094 pass down VECTORIZED_CALL to versioning | Richard Biener
When versioning is run the IL is already mangled and finding a VECTORIZED_CALL IFN can fail. 2020-01-20 Richard Biener <rguenther@suse.de> PR tree-optimization/93094 * tree-vectorizer.h (vect_loop_versioning): Adjust. (vect_transform_loop): Likewise. * tree-vectorizer.c (try_vectorize_loop_1): Pass down loop_vectorized_call to vect_transform_loop. * tree-vect-loop.c (vect_transform_loop): Pass down loop_vectorized_call to vect_loop_versioning. * tree-vect-loop-manip.c (vect_loop_versioning): Use the earlier discovered loop_vectorized_call. * gcc.dg/vect/pr93094.c: New testcase.
2020-01-14 | hash-table.h: support non-zero empty values in empty_slow (v2) | David Malcolm
gcc/cp/ChangeLog: * cp-gimplify.c (source_location_table_entry_hash::empty_zero_p): New static constant. * cp-tree.h (named_decl_hash::empty_zero_p): Likewise. (struct named_label_hash::empty_zero_p): Likewise. * decl2.c (mangled_decl_hash::empty_zero_p): Likewise. gcc/ChangeLog: * attribs.c (excl_hash_traits::empty_zero_p): New static constant. * gcov.c (function_start_pair_hash::empty_zero_p): Likewise. * graphite.c (struct sese_scev_hash::empty_zero_p): Likewise. * hash-map-tests.c (selftest::test_nonzero_empty_key): New selftest. (selftest::hash_map_tests_c_tests): Call it. * hash-map-traits.h (simple_hashmap_traits::empty_zero_p): New static constant, using the value of = H::empty_zero_p. (unbounded_hashmap_traits::empty_zero_p): Likewise, using the value from default_hash_traits <Value>. * hash-map.h (hash_map::empty_zero_p): Likewise, using the value from Traits. * hash-set-tests.c (value_hash_traits::empty_zero_p): Likewise. * hash-table.h (hash_table::alloc_entries): Guard the loop of calls to mark_empty with !Descriptor::empty_zero_p. (hash_table::empty_slow): Conditionalize the memset call with a check that Descriptor::empty_zero_p; otherwise, loop through the entries calling mark_empty on them. * hash-traits.h (int_hash::empty_zero_p): New static constant. (pointer_hash::empty_zero_p): Likewise. (pair_hash::empty_zero_p): Likewise. * ipa-devirt.c (default_hash_traits <type_pair>::empty_zero_p): Likewise. * ipa-prop.c (ipa_bit_ggc_hash_traits::empty_zero_p): Likewise. (ipa_vr_ggc_hash_traits::empty_zero_p): Likewise. * profile.c (location_triplet_hash::empty_zero_p): Likewise. * sanopt.c (sanopt_tree_triplet_hash::empty_zero_p): Likewise. (sanopt_tree_couple_hash::empty_zero_p): Likewise. * tree-hasher.h (int_tree_hasher::empty_zero_p): Likewise. * tree-ssa-sccvn.c (vn_ssa_aux_hasher::empty_zero_p): Likewise. * tree-vect-slp.c (bst_traits::empty_zero_p): Likewise. * tree-vectorizer.h (default_hash_traits<scalar_cond_masked_key>::empty_zero_p): Likewise.
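The guard described in the hash-table.h entries above can be illustrated with a minimal sketch. It is not GCC's `hash_table`: the traits structs, `clear_slots`, `all_empty` and `check_clear` are hypothetical names invented for this example, and only the `empty_zero_p` and `mark_empty` names come from the ChangeLog. The idea shown is that a bulk `memset` is only a valid way to empty the slots when the traits declare that the empty value is the all-zero bit pattern; otherwise each slot is marked empty individually.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical traits: the empty marker is the all-zero bit pattern.
struct zero_empty_traits {
  typedef int value_type;
  static const bool empty_zero_p = true;
  static void mark_empty(value_type &v) { v = 0; }
  static bool is_empty(const value_type &v) { return v == 0; }
};

// Hypothetical traits: the empty marker is -1, so zero is a valid value.
struct minus_one_empty_traits {
  typedef int value_type;
  static const bool empty_zero_p = false;
  static void mark_empty(value_type &v) { v = -1; }
  static bool is_empty(const value_type &v) { return v == -1; }
};

// Reset every slot to the empty state.  The memset shortcut is taken only
// when the traits say that "empty" is represented by zero bits.
template <typename Traits>
void clear_slots(std::vector<typename Traits::value_type> &slots) {
  typedef typename Traits::value_type value_type;
  if (slots.empty())
    return;
  if (Traits::empty_zero_p) {
    std::memset(slots.data(), 0, slots.size() * sizeof(value_type));
  } else {
    for (std::size_t i = 0; i < slots.size(); ++i)
      Traits::mark_empty(slots[i]);
  }
}

// True if every slot is in the empty state according to the traits.
template <typename Traits>
bool all_empty(const std::vector<typename Traits::value_type> &slots) {
  for (std::size_t i = 0; i < slots.size(); ++i)
    if (!Traits::is_empty(slots[i]))
      return false;
  return true;
}

// Fill N slots with a non-empty value, clear them, and check the result.
template <typename Traits>
void check_clear(std::size_t n) {
  std::vector<typename Traits::value_type> slots(n, 42);
  clear_slots<Traits>(slots);
  assert(all_empty<Traits>(slots));
}
```

With `minus_one_empty_traits`, a plain `memset` to zero would leave every slot holding the valid value 0 rather than the empty marker, which is the situation the `empty_zero_p` flag is there to rule out.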
2020-01-10 | [vect] Add missing comment | Andre Vieira
gcc/ChangeLog: 2020-01-10 Andre Vieira <andre.simoesdiasvieira@arm.com> * tree-vectorizer.h (get_dr_vinfo_offset): Add missing function comment. From-SVN: r280108
2020-01-10 | [vect] Keep track of DR_OFFSET advance in dr_vec_info rather than data_reference | Andre Vieira
gcc/ChangeLog: 2020-01-10 Andre Vieira <andre.simoesdiasvieira@arm.com> * tree-vect-data-refs.c (vect_create_addr_base_for_vector_ref): Use get_dr_vinfo_offset * tree-vect-loop.c (update_epilogue_loop_vinfo): Remove orig_drs_init parameter and its use to reset DR_OFFSET's. (vect_transform_loop): Remove orig_drs_init argument. * tree-vect-loop-manip.c (vect_update_init_of_dr): Update the offset member of dr_vec_info rather than the offset of the associated data_reference's innermost_loop_behavior. (vect_update_init_of_dr): Pass dr_vec_info instead of data_reference. (vect_do_peeling): Remove orig_drs_init parameter and its construction. * tree-vect-stmts.c (check_scan_store): Replace use of DR_OFFSET with get_dr_vinfo_offset. (vectorizable_store): Likewise. (vectorizable_load): Likewise. From-SVN: r280107
2020-01-01 | Update copyright years. | Jakub Jelinek
From-SVN: r279813
2019-11-29 | Record the vector mask precision in stmt_vec_info | Richard Sandiford
search_type_for_mask uses a worklist to search a chain of boolean operations for a natural vector mask type. This patch instead does that in vect_determine_stmt_precisions, where we also look for overpromoted integer operations. We then only need to compute the precision once and can cache it in the stmt_vec_info. The new function vect_determine_mask_precision is supposed to handle exactly the same cases as search_type_for_mask_1, and in the same way. There's a lot we could improve here, but that's not stage 3 material. I wondered about sharing mask_precision with other fields like operation_precision, but in the end that seemed too dangerous. We have patterns to convert between boolean and non-boolean operations and it would be very easy to get mixed up about which case the fields are describing. 2019-11-29 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (stmt_vec_info::mask_precision): New field. (vect_use_mask_type_p): New function. * tree-vect-patterns.c (vect_init_pattern_stmt): Copy the mask precision to the pattern statement. (append_pattern_def_seq): Add a scalar_type_for_mask parameter and use it to initialize the new stmt's mask precision. (search_type_for_mask_1): Delete. (search_type_for_mask): Replace with... (integer_type_for_mask): ...this new function. Use the information cached in the stmt_vec_info. (vect_recog_bool_pattern): Update accordingly. (build_mask_conversion): Pass the scalar type associated with the mask type to append_pattern_def_seq. (vect_recog_mask_conversion_pattern): Likewise. Call integer_type_for_mask instead of search_type_for_mask. (vect_convert_mask_for_vectype): Call integer_type_for_mask instead of search_type_for_mask. (possible_vector_mask_operation_p): New function. (vect_determine_mask_precision): Likewise. (vect_determine_stmt_precisions): Call it. From-SVN: r278850
2019-11-29 | Make vect_get_mask_type_for_stmt take a group size | Richard Sandiford
This patch makes vect_get_mask_type_for_stmt and get_mask_type_for_scalar_type take a group size instead of the SLP node, so that later patches can call it before an SLP node has been built. 2019-11-29 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (get_mask_type_for_scalar_type): Replace the slp_tree parameter with a group size parameter. (vect_get_mask_type_for_stmt): Likewise. * tree-vect-stmts.c (get_mask_type_for_scalar_type): Likewise. (vect_get_mask_type_for_stmt): Likewise. * tree-vect-slp.c (vect_slp_analyze_node_operations_1): Update call accordingly. From-SVN: r278849
2019-11-16 | Optionally pick the cheapest loop_vec_info | Richard Sandiford
This patch adds a mode in which the vectoriser tries each available base vector mode and picks the one with the lowest cost. The new behaviour is selected by autovectorize_vector_modes. The patch keeps the current behaviour of preferring a VF of loop->simdlen over any larger or smaller VF, regardless of costs or target preferences. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * target.h (VECT_COMPARE_COSTS): New constant. * target.def (autovectorize_vector_modes): Return a bitmask of flags. * doc/tm.texi: Regenerate. * targhooks.h (default_autovectorize_vector_modes): Update accordingly. * targhooks.c (default_autovectorize_vector_modes): Likewise. * config/aarch64/aarch64.c (aarch64_autovectorize_vector_modes): Likewise. * config/arc/arc.c (arc_autovectorize_vector_modes): Likewise. * config/arm/arm.c (arm_autovectorize_vector_modes): Likewise. * config/i386/i386.c (ix86_autovectorize_vector_modes): Likewise. * config/mips/mips.c (mips_autovectorize_vector_modes): Likewise. * tree-vectorizer.h (_loop_vec_info::vec_outside_cost) (_loop_vec_info::vec_inside_cost): New member variables. * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize them. (vect_better_loop_vinfo_p, vect_joust_loop_vinfos): New functions. (vect_analyze_loop): When autovectorize_vector_modes returns VECT_COMPARE_COSTS, try vectorizing the loop with each available vector mode and picking the one with the lowest cost. (vect_estimate_min_profitable_iters): Record the computed costs in the loop_vec_info. From-SVN: r278336
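The selection described above can be sketched roughly as follows. This is a simplified illustration, not the logic of `vect_better_loop_vinfo_p`: the `candidate` struct, `better_candidate_p` and `pick_cheapest` are hypothetical, and the cost comparison (inside cost per scalar iteration, with the outside cost as a tie-breaker) is an assumption made for the example. The only behaviour taken from the commit message is that a VF equal to `loop->simdlen` wins regardless of cost.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One analysed vectorisation candidate (hypothetical stand-in for a
// loop_vec_info with its recorded costs).
struct candidate {
  std::string mode;       // label for the base vector mode that was tried
  unsigned vf;            // vectorisation factor
  unsigned inside_cost;   // assumed cost of one vector loop iteration
  unsigned outside_cost;  // assumed one-off cost outside the loop
};

// Return true if A should be preferred over B.  SIMDLEN is 0 when the loop
// has no requested simdlen.
bool better_candidate_p(const candidate &a, const candidate &b,
                        unsigned simdlen) {
  assert(a.vf > 0 && b.vf > 0);
  // A VF that matches simdlen is preferred over any other VF.
  if (simdlen != 0) {
    bool a_match = a.vf == simdlen;
    bool b_match = b.vf == simdlen;
    if (a_match != b_match)
      return a_match;
  }
  // Assumption: compare inside_cost / vf without dividing, by
  // cross-multiplying.
  unsigned long long lhs = (unsigned long long) a.inside_cost * b.vf;
  unsigned long long rhs = (unsigned long long) b.inside_cost * a.vf;
  if (lhs != rhs)
    return lhs < rhs;
  return a.outside_cost < b.outside_cost;
}

// Return the preferred candidate, or a null pointer if there are none.
const candidate *pick_cheapest(const std::vector<candidate> &cands,
                               unsigned simdlen) {
  const candidate *best = nullptr;
  for (std::size_t i = 0; i < cands.size(); ++i)
    if (!best || better_candidate_p(cands[i], *best, simdlen))
      best = &cands[i];
  return best;
}
```

Cross-multiplying keeps the comparison in integers, so two candidates with different VFs can be compared on cost per scalar iteration without rounding.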
2019-11-16 | Extend can_duplicate_and_interleave_p to mixed-size vectors | Richard Sandiford
This patch makes can_duplicate_and_interleave_p cope with mixtures of vector sizes, by using queries based on get_vectype_for_scalar_type instead of directly querying GET_MODE_SIZE (vinfo->vector_mode). int_mode_for_size is now the first check we do for a candidate mode, so it seemed better to restrict it to MAX_FIXED_MODE_SIZE. This avoids unnecessary work and avoids trying to create scalar types that the target might not support. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (can_duplicate_and_interleave_p): Take an element type rather than an element mode. * tree-vect-slp.c (can_duplicate_and_interleave_p): Likewise. Use get_vectype_for_scalar_type to query the natural types for a given element type rather than basing everything on GET_MODE_SIZE (vinfo->vector_mode). Limit int_mode_for_size query to MAX_FIXED_MODE_SIZE. (duplicate_and_interleave): Update call accordingly. * tree-vect-loop.c (vectorizable_reduction): Likewise. From-SVN: r278335
2019-11-16 | Apply maximum nunits for BB SLP | Richard Sandiford
The BB vectoriser picked vector types in the same way as the loop vectoriser: it picked a vector mode/size for the region and then based all the vector types off that choice. This meant we could end up trying to use vector types that had too many elements for the group size.

The main part of this patch is therefore about passing the SLP group size down to routines like get_vectype_for_scalar_type and ensuring that each vector type in the SLP tree is chosen wrt the group size. That part in itself is pretty easy and mechanical. The main warts are:

(1) We normally pick a STMT_VINFO_VECTYPE for data references at an early stage (vect_analyze_data_refs). However, nothing in the BB vectoriser relied on this, or on the min_vf calculated from it. I couldn't see anything other than vect_recog_bool_pattern that tried to access the vector type before the SLP tree is built.

(2) It's possible for the same statement to be used in groups of different sizes. Taking the group size into account meant that we could try to pick different vector types for the same statement. This problem should go away with the move to doing everything on SLP trees, where presumably we would attach the vector type to the SLP node rather than the stmt_vec_info. Until then, the patch just uses a first-come, first-served approach.

(3) A similar problem exists for grouped data references, where different statements in the same dataref group could be used in SLP nodes that have different group sizes. The patch copes with that by making sure that all vector types in a dataref group remain consistent.

The patch means that:

    void
    f (int *x, short *y)
    {
      x[0] += y[0];
      x[1] += y[1];
      x[2] += y[2];
      x[3] += y[3];
    }

now produces:

    ldr     q0, [x0]
    ldr     d1, [x1]
    saddw   v0.4s, v0.4s, v1.4h
    str     q0, [x0]
    ret

instead of:

    ldrsh   w2, [x1]
    ldrsh   w3, [x1, 2]
    fmov    s0, w2
    ldrsh   w2, [x1, 4]
    ldrsh   w1, [x1, 6]
    ins     v0.s[1], w3
    ldr     q1, [x0]
    ins     v0.s[2], w2
    ins     v0.s[3], w1
    add     v0.4s, v0.4s, v1.4s
    str     q0, [x0]
    ret

Unfortunately it also means we start to vectorise gcc.target/i386/pr84101.c for -m32. That seems like a target cost issue though; see PR92265 for details.

2019-11-16 Richard Sandiford <richard.sandiford@arm.com>

gcc/
* tree-vectorizer.h (vect_get_vector_types_for_stmt): Take an optional maximum nunits.
  (get_vectype_for_scalar_type): Likewise. Also declare a form that takes an slp_tree.
  (get_mask_type_for_scalar_type): Take an optional slp_tree.
  (vect_get_mask_type_for_stmt): Likewise.
* tree-vect-data-refs.c (vect_analyze_data_refs): Don't store the vector type in STMT_VINFO_VECTYPE for BB vectorization.
* tree-vect-patterns.c (vect_recog_bool_pattern): Use vect_get_vector_types_for_stmt instead of STMT_VINFO_VECTYPE to get an assumed vector type for data references.
* tree-vect-slp.c (vect_update_shared_vectype): New function.
  (vect_update_all_shared_vectypes): Likewise.
  (vect_build_slp_tree_1): Pass the group size to vect_get_vector_types_for_stmt. Use vect_update_shared_vectype for BB vectorization.
  (vect_build_slp_tree_2): Call vect_update_all_shared_vectypes before building the vector from scalars.
  (vect_analyze_slp_instance): Pass the group size to get_vectype_for_scalar_type.
  (vect_slp_analyze_node_operations_1): Don't recompute the vector types for BB vectorization here; just handle the case in which we deferred the choice for booleans.
  (vect_get_constant_vectors): Pass the slp_tree to get_vectype_for_scalar_type.
* tree-vect-stmts.c (vect_prologue_cost_for_slp_op): Likewise.
  (vectorizable_call): Likewise.
  (vectorizable_simd_clone_call): Likewise.
  (vectorizable_conversion): Likewise.
  (vectorizable_shift): Likewise.
  (vectorizable_operation): Likewise.
  (vectorizable_comparison): Likewise.
  (vect_is_simple_cond): Take the slp_tree as argument and pass it to get_vectype_for_scalar_type.
  (vectorizable_condition): Update call accordingly.
  (get_vectype_for_scalar_type): Take a group_size argument. For BB vectorization, limit the vector to that number of elements. Also define an overload that takes an slp_tree.
  (get_mask_type_for_scalar_type): Add an slp_tree argument and pass it to get_vectype_for_scalar_type.
  (vect_get_vector_types_for_stmt): Add a group_size argument and pass it to get_vectype_for_scalar_type. Don't use the cached vector type for BB vectorization if a group size is given. Handle data references in that case.
  (vect_get_mask_type_for_stmt): Take an slp_tree argument and pass it to get_mask_type_for_scalar_type.

gcc/testsuite/
* gcc.dg/vect/bb-slp-4.c: Expect the block to be vectorized with -fno-vect-cost-model.
* gcc.dg/vect/bb-slp-bool-1.c: New test.
* gcc.target/aarch64/vect_mixed_sizes_14.c: Likewise.
* gcc.target/i386/pr84101.c: XFAIL for -m32.

From-SVN: r278334
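The effect of the group-size limit on the `f (int *x, short *y)` example can be illustrated with a small sketch. The functions `natural_nunits` and `capped_nunits` are hypothetical and are not the GCC routines named in the ChangeLog. Rounding the cap down to a power of two is an assumption of this sketch rather than something the commit message states.

```cpp
#include <cassert>

// Number of elements in a full vector of VECTOR_BYTES bytes whose elements
// are ELEM_BYTES bytes wide.
unsigned natural_nunits(unsigned vector_bytes, unsigned elem_bytes) {
  assert(elem_bytes > 0 && vector_bytes % elem_bytes == 0);
  return vector_bytes / elem_bytes;
}

// Like natural_nunits, but never return more elements than GROUP_SIZE.
// A GROUP_SIZE of 0 means "no limit" (the loop-vectoriser case).
// Assumption of this sketch: the cap is rounded down to a power of two.
unsigned capped_nunits(unsigned vector_bytes, unsigned elem_bytes,
                       unsigned group_size) {
  unsigned nunits = natural_nunits(vector_bytes, elem_bytes);
  if (group_size == 0 || nunits <= group_size)
    return nunits;
  unsigned capped = 1;
  while (capped * 2 <= group_size)
    capped *= 2;
  return capped;
}
```

For the example above, with 16-byte vectors and a group of four statements, the `int` accesses keep four elements, while the `short` accesses are capped from eight elements to four. That is consistent with the `v1.4h` operand in the improved assembly.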
2019-11-14 | Avoid retrying with the same vector modes | Richard Sandiford
A later patch makes the AArch64 port add four entries to autovectorize_vector_modes. Each entry describes a different vector mode assignment for vector code that mixes 8-bit, 16-bit, 32-bit and 64-bit elements. But if (as usual) the vector code has fewer element sizes than that, we could end up trying the same combination of vector modes multiple times. This patch adds a check to prevent that. 2019-11-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (vec_info::mode_set): New typedef. (vec_info::used_vector_mode): New member variable. (vect_chooses_same_modes_p): Declare. * tree-vect-stmts.c (get_vectype_for_scalar_type): Record each chosen vector mode in vec_info::used_vector_mode. (vect_chooses_same_modes_p): New function. * tree-vect-loop.c (vect_analyze_loop): Use it to avoid trying the same vector statements multiple times. * tree-vect-slp.c (vect_slp_bb_region): Likewise. From-SVN: r278242
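The redundancy check described above can be sketched as follows. This is an illustration under simplifying assumptions, not the implementation of `vect_chooses_same_modes_p`: `mode_table`, `attempt` and `redundant_p` are hypothetical names, and vector modes are represented by plain string labels keyed on element size in bits.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical: one autovectorize_vector_modes entry, mapping an element
// size in bits to the label of the vector mode it would use.
typedef std::map<unsigned, std::string> mode_table;

// One analysis attempt.  It records every vector mode it actually chose,
// loosely mirroring the idea behind vec_info::used_vector_mode.
struct attempt {
  mode_table table;  // the mode assignment being tried
  mode_table used;   // the subset of TABLE that the code really needed

  explicit attempt(const mode_table &t) : table(t) {}

  // Look up the vector mode for ELEM_BITS and remember that it was used.
  const std::string &vector_mode_for(unsigned elem_bits) {
    mode_table::const_iterator it = table.find(elem_bits);
    assert(it != table.end());
    used[elem_bits] = it->second;
    return it->second;
  }
};

// Return true if trying NEXT would pick exactly the same vector modes as
// PREV did for every element size that PREV actually used.
bool redundant_p(const attempt &prev, const mode_table &next) {
  for (mode_table::const_iterator it = prev.used.begin();
       it != prev.used.end(); ++it) {
    mode_table::const_iterator n = next.find(it->first);
    if (n == next.end() || n->second != it->second)
      return false;
  }
  return true;
}
```

Two table entries that differ only in their 16-bit mode are indistinguishable for code that uses only 8-bit and 32-bit elements, so the second attempt can be skipped.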
2019-11-14 | Support vectorisation with mixed vector sizes | Richard Sandiford
After previous patches, it's now possible to make the vectoriser support multiple vector sizes in the same vector region, using related_vector_mode to pick the right vector mode for a given element mode. No port yet takes advantage of this, but I have a follow-on patch for AArch64. This patch also seemed like a good opportunity to add some more dump messages: one to make it clear which vector size/mode was being used when analysis passed or failed, and another to say when we've decided to skip a redundant vector size/mode. 2019-11-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * machmode.h (opt_machine_mode::operator==): New function. (opt_machine_mode::operator!=): Likewise. * tree-vectorizer.h (vec_info::vector_mode): Update comment. (get_related_vectype_for_scalar_type): Delete. (get_vectype_for_scalar_type_and_size): Declare. * tree-vect-slp.c (vect_slp_bb_region): Print dump messages to say whether analysis passed or failed, and with what vector modes. Use related_vector_mode to check whether trying a particular vector mode would be redundant with the autodetected mode, and print a dump message if we decide to skip it. * tree-vect-loop.c (vect_analyze_loop): Likewise. (vect_create_epilog_for_reduction): Use get_related_vectype_for_scalar_type instead of get_vectype_for_scalar_type_and_size. * tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Replace with... (get_related_vectype_for_scalar_type): ...this new function. Take a starting/"prevailing" vector mode rather than a vector size. Take an optional nunits argument, with the same meaning as for related_vector_mode. Use related_vector_mode when not auto-detecting a mode, falling back to mode_for_vector if no target mode exists. (get_vectype_for_scalar_type): Update accordingly. (get_same_sized_vectype): Likewise. * tree-vectorizer.c (get_vec_alignment_for_array_type): Likewise. From-SVN: r278240
2019-11-14 | Replace vec_info::vector_size with vec_info::vector_mode | Richard Sandiford
This patch replaces vec_info::vector_size with vec_info::vector_mode, but for now continues to use it as a way of specifying a single vector size. This makes it easier for later patches to use related_vector_mode instead. 2019-11-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (vec_info::vector_size): Replace with... (vec_info::vector_mode): ...this new field. * tree-vect-loop.c (vect_update_vf_for_slp): Update accordingly. (vect_analyze_loop, vect_transform_loop): Likewise. * tree-vect-loop-manip.c (vect_do_peeling): Likewise. * tree-vect-slp.c (can_duplicate_and_interleave_p): Likewise. (vect_make_slp_decision, vect_slp_bb_region): Likewise. * tree-vect-stmts.c (get_vectype_for_scalar_type): Likewise. * tree-vectorizer.c (try_vectorize_loop_1): Likewise. gcc/testsuite/ * gcc.dg/vect/vect-tail-nomask-1.c: Update expected epilogue vectorization message. From-SVN: r278237
2019-11-14 | Add build_truth_vector_type_for_mode | Richard Sandiford
Callers of vect_halve_mask_nunits and vect_double_mask_nunits already know what mode the resulting vector type should have, so we might as well create the vector type directly with that mode, just like build_vector_type_for_mode lets us build normal vectors with a known mode. This avoids the current awkwardness of having to recompute the mode starting from vec_info::vector_size, which hard-codes the assumption that all vectors have to be the same size.

A later patch gets rid of build_truth_vector_type and build_same_sized_truth_vector_type, so the net effect of the series is to reduce the number of type functions by one.

2019-11-14 Richard Sandiford <richard.sandiford@arm.com>

gcc/
* tree.h (build_truth_vector_type_for_mode): Declare.
* tree.c (build_truth_vector_type_for_mode): New function, split out from...
  (build_truth_vector_type): ...here.
  (build_opaque_vector_type): Fix head comment.
* tree-vectorizer.h (supportable_narrowing_operation): Remove vec_info parameter.
  (vect_halve_mask_nunits): Replace vec_info parameter with the mode of the new vector.
  (vect_double_mask_nunits): Likewise.
* tree-vect-loop.c (vect_halve_mask_nunits): Likewise.
  (vect_double_mask_nunits): Likewise.
* tree-vect-loop-manip.c: Include insn-config.h, rtl.h and recog.h.
  (vect_maybe_permute_loop_masks): Remove vinfo parameter. Update call to vect_halve_mask_nunits, getting the required mode from the unpack patterns.
  (vect_set_loop_condition_masked): Update call accordingly.
* tree-vect-stmts.c (supportable_narrowing_operation): Remove vec_info parameter and update call to vect_double_mask_nunits.
  (vectorizable_conversion): Update call accordingly.
  (simple_integer_narrowing): Likewise. Remove vec_info parameter.
  (vectorizable_call): Update call accordingly.
  (supportable_widening_operation): Update call to vect_halve_mask_nunits.
* config/aarch64/aarch64-sve-builtins.cc (register_builtin_types): Use build_truth_vector_type_for_mode instead of build_truth_vector_type.

From-SVN: r278231
2019-11-13 | Avoid accounting for non-existent vector loop versioning | Richard Sandiford
vect_analyze_loop_costing uses two profitability thresholds: a runtime one and a static compile-time one. The runtime one is simply the point at which the vector loop is cheaper than the scalar loop, while the static one also takes into account the cost of choosing between the scalar and vector loops at runtime. We compare this static cost against the expected execution frequency to decide whether it's worth generating any vector code at all. However, we never reclaimed the cost of applying the runtime threshold if it turned out that the vector code can always be used. And we only know whether that's true once we've calculated what the runtime threshold would be. 2019-11-13 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (vect_apply_runtime_profitability_check_p): New function. * tree-vect-loop-manip.c (vect_loop_versioning): Use it. * tree-vect-loop.c (vect_analyze_loop_2): Likewise. (vect_transform_loop): Likewise. (vect_analyze_loop_costing): Don't take the cost of versioning into account for the static profitability threshold if it turns out that no versioning is needed. From-SVN: r278124
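The relationship between the two thresholds can be sketched as below. This is a hedged illustration, not the body of `vect_apply_runtime_profitability_check_p` or `vect_analyze_loop_costing`: the `loop_costs` struct and both functions are hypothetical. The condition used to decide whether a runtime check is needed is an assumption of the sketch: the iteration count is unknown at compile time and the runtime threshold is at least the vectorisation factor.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical summary of the costing results for one loop.
struct loop_costs {
  bool niters_known;              // iteration count known at compile time?
  unsigned vf;                    // vectorisation factor
  unsigned runtime_threshold;     // iterations at which vector beats scalar
  unsigned versioning_threshold;  // same, once the cost of choosing between
                                  // the two loops at runtime is included
};

// Assumed condition for needing a runtime profitability check: the
// iteration count is unknown and the threshold is not already implied by
// needing at least VF iterations to enter the vector loop.
bool apply_runtime_check_p(const loop_costs &c) {
  assert(c.vf > 0);
  return !c.niters_known && c.runtime_threshold >= c.vf;
}

// Static threshold compared against the expected execution frequency.
// The versioning cost only counts if a runtime check will really exist.
unsigned static_threshold(const loop_costs &c) {
  if (apply_runtime_check_p(c))
    return std::max(c.runtime_threshold, c.versioning_threshold);
  return c.runtime_threshold;
}
```

The order matters: the runtime threshold has to be computed first, because only then is it known whether the check, and therefore its cost, exists at all.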
2019-11-13 | Don't assign a cost to vectorizable_assignment | Richard Sandiford
vectorizable_assignment handles true SSA-to-SSA copies (which hopefully we don't see in practice) and no-op conversions that are required to maintain correct gimple, such as changes between signed and unsigned types. These cases shouldn't generate any code and so shouldn't count against either the scalar or vector costs. Later patches test this, but it seemed worth splitting out. 2019-11-13 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (vect_nop_conversion_p): Declare. * tree-vect-stmts.c (vect_nop_conversion_p): New function. (vectorizable_assignment): Don't add a cost for nop conversions. * tree-vect-loop.c (vect_compute_single_scalar_iteration_cost): Likewise. * tree-vect-slp.c (vect_bb_slp_scalar_cost): Likewise. From-SVN: r278122
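As a rough illustration of the costing rule described above, the sketch below treats a conversion between integer types of the same precision (for example signed to unsigned) as free. It is not the GCC implementation of `vect_nop_conversion_p`: the `int_type` struct, `nop_conversion_p` and `conversion_cost` are hypothetical, and reducing the test to a precision comparison is a simplifying assumption.

```cpp
#include <cassert>

// Hypothetical description of a scalar integer type.
struct int_type {
  unsigned precision;  // width in bits
  bool is_unsigned;
};

// Assumption of this sketch: a conversion generates no code when the
// precision is unchanged, so only the signedness can differ.
bool nop_conversion_p(const int_type &from, const int_type &to) {
  assert(from.precision > 0 && to.precision > 0);
  return from.precision == to.precision;
}

// Cost to record for a conversion: nothing for a no-op, UNIT_COST otherwise.
// The same rule would apply to both the scalar and the vector cost.
unsigned conversion_cost(const int_type &from, const int_type &to,
                         unsigned unit_cost) {
  return nop_conversion_p(from, to) ? 0 : unit_cost;
}
```

Applying the rule to both sides keeps the comparison fair: a statement that generates no code should not count for or against vectorisation.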
2019-11-12 | Apply mechanical replacement (generated patch). | Martin Liska
2019-11-12 Martin Liska <mliska@suse.cz> * asan.c (asan_sanitize_stack_p): Replace old parameter syntax with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET macro. (asan_sanitize_allocas_p): Likewise. (asan_emit_stack_protection): Likewise. (asan_protect_global): Likewise. (instrument_derefs): Likewise. (instrument_builtin_call): Likewise. (asan_expand_mark_ifn): Likewise. * auto-profile.c (auto_profile): Likewise. * bb-reorder.c (copy_bb_p): Likewise. (duplicate_computed_gotos): Likewise. * builtins.c (inline_expand_builtin_string_cmp): Likewise. * cfgcleanup.c (try_crossjump_to_edge): Likewise. (try_crossjump_bb): Likewise. * cfgexpand.c (defer_stack_allocation): Likewise. (stack_protect_classify_type): Likewise. (pass_expand::execute): Likewise. * cfgloopanal.c (expected_loop_iterations_unbounded): Likewise. (estimate_reg_pressure_cost): Likewise. * cgraph.c (cgraph_edge::maybe_hot_p): Likewise. * combine.c (combine_instructions): Likewise. (record_value_for_reg): Likewise. * common/config/aarch64/aarch64-common.c (aarch64_option_validate_param): Likewise. (aarch64_option_default_params): Likewise. * common/config/ia64/ia64-common.c (ia64_option_default_params): Likewise. * common/config/powerpcspe/powerpcspe-common.c (rs6000_option_default_params): Likewise. * common/config/rs6000/rs6000-common.c (rs6000_option_default_params): Likewise. * common/config/sh/sh-common.c (sh_option_default_params): Likewise. * config/aarch64/aarch64.c (aarch64_output_probe_stack_range): Likewise. (aarch64_allocate_and_probe_stack_space): Likewise. (aarch64_expand_epilogue): Likewise. (aarch64_override_options_internal): Likewise. * config/alpha/alpha.c (alpha_option_override): Likewise. * config/arm/arm.c (arm_option_override): Likewise. (arm_valid_target_attribute_p): Likewise. * config/i386/i386-options.c (ix86_option_override_internal): Likewise. * config/i386/i386.c (get_probe_interval): Likewise. (ix86_adjust_stack_and_probe_stack_clash): Likewise. 
(ix86_max_noce_ifcvt_seq_cost): Likewise. * config/ia64/ia64.c (ia64_adjust_cost): Likewise. * config/rs6000/rs6000-logue.c (get_stack_clash_protection_probe_interval): Likewise. (get_stack_clash_protection_guard_size): Likewise. * config/rs6000/rs6000.c (rs6000_option_override_internal): Likewise. * config/s390/s390.c (allocate_stack_space): Likewise. (s390_emit_prologue): Likewise. (s390_option_override_internal): Likewise. * config/sparc/sparc.c (sparc_option_override): Likewise. * config/visium/visium.c (visium_option_override): Likewise. * coverage.c (get_coverage_counts): Likewise. (coverage_compute_profile_id): Likewise. (coverage_begin_function): Likewise. (coverage_end_function): Likewise. * cse.c (cse_find_path): Likewise. (cse_extended_basic_block): Likewise. (cse_main): Likewise. * cselib.c (cselib_invalidate_mem): Likewise. * dse.c (dse_step1): Likewise. * emit-rtl.c (set_new_first_and_last_insn): Likewise. (get_max_insn_count): Likewise. (make_debug_insn_raw): Likewise. (init_emit): Likewise. * explow.c (compute_stack_clash_protection_loop_data): Likewise. * final.c (compute_alignments): Likewise. * fold-const.c (fold_range_test): Likewise. (fold_truth_andor): Likewise. (tree_single_nonnegative_warnv_p): Likewise. (integer_valued_real_single_p): Likewise. * gcse.c (want_to_gcse_p): Likewise. (prune_insertions_deletions): Likewise. (hoist_code): Likewise. (gcse_or_cprop_is_too_expensive): Likewise. * ggc-common.c: Likewise. * ggc-page.c (ggc_collect): Likewise. * gimple-loop-interchange.cc (MAX_NUM_STMT): Likewise. (MAX_DATAREFS): Likewise. (OUTER_STRIDE_RATIO): Likewise. * gimple-loop-jam.c (tree_loop_unroll_and_jam): Likewise. * gimple-loop-versioning.cc (loop_versioning::max_insns_for_loop): Likewise. * gimple-ssa-split-paths.c (is_feasible_trace): Likewise. * gimple-ssa-store-merging.c (imm_store_chain_info::try_coalesce_bswap): Likewise. (imm_store_chain_info::coalesce_immediate_stores): Likewise. 
(imm_store_chain_info::output_merged_store): Likewise. (pass_store_merging::process_store): Likewise. * gimple-ssa-strength-reduction.c (find_basis_for_base_expr): Likewise. * graphite-isl-ast-to-gimple.c (class translate_isl_ast_to_gimple): Likewise. (scop_to_isl_ast): Likewise. * graphite-optimize-isl.c (get_schedule_for_node_st): Likewise. (optimize_isl): Likewise. * graphite-scop-detection.c (build_scops): Likewise. * haifa-sched.c (set_modulo_params): Likewise. (rank_for_schedule): Likewise. (model_add_to_worklist): Likewise. (model_promote_insn): Likewise. (model_choose_insn): Likewise. (queue_to_ready): Likewise. (autopref_multipass_dfa_lookahead_guard): Likewise. (schedule_block): Likewise. (sched_init): Likewise. * hsa-gen.c (init_prologue): Likewise. * ifcvt.c (bb_ok_for_noce_convert_multiple_sets): Likewise. (cond_move_process_if_block): Likewise. * ipa-cp.c (ipcp_lattice::add_value): Likewise. (merge_agg_lats_step): Likewise. (devirtualization_time_bonus): Likewise. (hint_time_bonus): Likewise. (incorporate_penalties): Likewise. (good_cloning_opportunity_p): Likewise. (ipcp_propagate_stage): Likewise. * ipa-fnsummary.c (decompose_param_expr): Likewise. (set_switch_stmt_execution_predicate): Likewise. (analyze_function_body): Likewise. (compute_fn_summary): Likewise. * ipa-inline-analysis.c (estimate_growth): Likewise. * ipa-inline.c (caller_growth_limits): Likewise. (inline_insns_single): Likewise. (inline_insns_auto): Likewise. (can_inline_edge_by_limits_p): Likewise. (want_early_inline_function_p): Likewise. (big_speedup_p): Likewise. (want_inline_small_function_p): Likewise. (want_inline_self_recursive_call_p): Likewise. (edge_badness): Likewise. (recursive_inlining): Likewise. (compute_max_insns): Likewise. (early_inliner): Likewise. * ipa-polymorphic-call.c (csftc_abort_walking_p): Likewise. * ipa-profile.c (ipa_profile): Likewise. * ipa-prop.c (determine_known_aggregate_parts): Likewise. (ipa_analyze_node): Likewise. 
(ipcp_transform_function): Likewise. * ipa-split.c (consider_split): Likewise. * ipa-sra.c (allocate_access): Likewise. (process_scan_results): Likewise. (ipa_sra_summarize_function): Likewise. (pull_accesses_from_callee): Likewise. * ira-build.c (loop_compare_func): Likewise. (mark_loops_for_removal): Likewise. * ira-conflicts.c (build_conflict_bit_table): Likewise. * loop-doloop.c (doloop_optimize): Likewise. * loop-invariant.c (gain_for_invariant): Likewise. (move_loop_invariants): Likewise. * loop-unroll.c (decide_unroll_constant_iterations): Likewise. (decide_unroll_runtime_iterations): Likewise. (decide_unroll_stupid): Likewise. (expand_var_during_unrolling): Likewise. * lra-assigns.c (spill_for): Likewise. * lra-constraints.c (EBB_PROBABILITY_CUTOFF): Likewise. * modulo-sched.c (sms_schedule): Likewise. (DFA_HISTORY): Likewise. * opts.c (default_options_optimization): Likewise. (finish_options): Likewise. (common_handle_option): Likewise. * postreload-gcse.c (eliminate_partially_redundant_load): Likewise. (if): Likewise. * predict.c (get_hot_bb_threshold): Likewise. (maybe_hot_count_p): Likewise. (probably_never_executed): Likewise. (predictable_edge_p): Likewise. (predict_loops): Likewise. (expr_expected_value_1): Likewise. (tree_predict_by_opcode): Likewise. (handle_missing_profiles): Likewise. * reload.c (find_equiv_reg): Likewise. * reorg.c (redundant_insn): Likewise. * resource.c (mark_target_live_regs): Likewise. (incr_ticks_for_insn): Likewise. * sanopt.c (pass_sanopt::execute): Likewise. * sched-deps.c (sched_analyze_1): Likewise. (sched_analyze_2): Likewise. (sched_analyze_insn): Likewise. (deps_analyze_insn): Likewise. * sched-ebb.c (schedule_ebbs): Likewise. * sched-rgn.c (find_single_block_region): Likewise. (too_large): Likewise. (haifa_find_rgns): Likewise. (extend_rgns): Likewise. (new_ready): Likewise. (schedule_region): Likewise. (sched_rgn_init): Likewise. * sel-sched-ir.c (make_region_from_loop): Likewise. 
* sel-sched-ir.h (MAX_WS): Likewise. * sel-sched.c (process_pipelined_exprs): Likewise. (sel_setup_region_sched_flags): Likewise. * shrink-wrap.c (try_shrink_wrapping): Likewise. * targhooks.c (default_max_noce_ifcvt_seq_cost): Likewise. * toplev.c (print_version): Likewise. (process_options): Likewise. * tracer.c (tail_duplicate): Likewise. * trans-mem.c (tm_log_add): Likewise. * tree-chrec.c (chrec_fold_plus_1): Likewise. * tree-data-ref.c (split_constant_offset): Likewise. (compute_all_dependences): Likewise. * tree-if-conv.c (MAX_PHI_ARG_NUM): Likewise. * tree-inline.c (remap_gimple_stmt): Likewise. * tree-loop-distribution.c (MAX_DATAREFS_NUM): Likewise. * tree-parloops.c (MIN_PER_THREAD): Likewise. (create_parallel_loop): Likewise. * tree-predcom.c (determine_unroll_factor): Likewise. * tree-scalar-evolution.c (instantiate_scev_r): Likewise. * tree-sra.c (analyze_all_variable_accesses): Likewise. * tree-ssa-ccp.c (fold_builtin_alloca_with_align): Likewise. * tree-ssa-dse.c (setup_live_bytes_from_ref): Likewise. (dse_optimize_redundant_stores): Likewise. (dse_classify_store): Likewise. * tree-ssa-ifcombine.c (ifcombine_ifandif): Likewise. * tree-ssa-loop-ch.c (ch_base::copy_headers): Likewise. * tree-ssa-loop-im.c (LIM_EXPENSIVE): Likewise. * tree-ssa-loop-ivcanon.c (try_unroll_loop_completely): Likewise. (try_peel_loop): Likewise. (tree_unroll_loops_completely): Likewise. * tree-ssa-loop-ivopts.c (avg_loop_niter): Likewise. (CONSIDER_ALL_CANDIDATES_BOUND): Likewise. (MAX_CONSIDERED_GROUPS): Likewise. (ALWAYS_PRUNE_CAND_SET_BOUND): Likewise. * tree-ssa-loop-manip.c (can_unroll_loop_p): Likewise. * tree-ssa-loop-niter.c (MAX_ITERATIONS_TO_TRACK): Likewise. * tree-ssa-loop-prefetch.c (PREFETCH_BLOCK): Likewise. (L1_CACHE_SIZE_BYTES): Likewise. (L2_CACHE_SIZE_BYTES): Likewise. (should_issue_prefetch_p): Likewise. (schedule_prefetches): Likewise. (determine_unroll_factor): Likewise. (volume_of_references): Likewise. (add_subscript_strides): Likewise. 
(self_reuse_distance): Likewise. (mem_ref_count_reasonable_p): Likewise. (insn_to_prefetch_ratio_too_small_p): Likewise. (loop_prefetch_arrays): Likewise. (tree_ssa_prefetch_arrays): Likewise. * tree-ssa-loop-unswitch.c (tree_unswitch_single_loop): Likewise. * tree-ssa-math-opts.c (gimple_expand_builtin_pow): Likewise. (convert_mult_to_fma): Likewise. (math_opts_dom_walker::after_dom_children): Likewise. * tree-ssa-phiopt.c (cond_if_else_store_replacement): Likewise. (hoist_adjacent_loads): Likewise. (gate_hoist_loads): Likewise. * tree-ssa-pre.c (translate_vuse_through_block): Likewise. (compute_partial_antic_aux): Likewise. * tree-ssa-reassoc.c (get_reassociation_width): Likewise. * tree-ssa-sccvn.c (vn_reference_lookup_pieces): Likewise. (vn_reference_lookup): Likewise. (do_rpo_vn): Likewise. * tree-ssa-scopedtables.c (avail_exprs_stack::lookup_avail_expr): Likewise. * tree-ssa-sink.c (select_best_block): Likewise. * tree-ssa-strlen.c (new_stridx): Likewise. (new_addr_stridx): Likewise. (get_range_strlen_dynamic): Likewise. (class ssa_name_limit_t): Likewise. * tree-ssa-structalias.c (push_fields_onto_fieldstack): Likewise. (create_variable_info_for_1): Likewise. (init_alias_vars): Likewise. * tree-ssa-tail-merge.c (find_clusters_1): Likewise. (tail_merge_optimize): Likewise. * tree-ssa-threadbackward.c (thread_jumps::profitable_jump_thread_path): Likewise. (thread_jumps::fsm_find_control_statement_thread_paths): Likewise. (thread_jumps::find_jump_threads_backwards): Likewise. * tree-ssa-threadedge.c (record_temporary_equivalences_from_stmts_at_dest): Likewise. * tree-ssa-uninit.c (compute_control_dep_chain): Likewise. * tree-switch-conversion.c (switch_conversion::check_range): Likewise. (jump_table_cluster::can_be_handled): Likewise. * tree-switch-conversion.h (jump_table_cluster::case_values_threshold): Likewise. (SWITCH_CONVERSION_BRANCH_RATIO): Likewise. (param_switch_conversion_branch_ratio): Likewise. 
* tree-vect-data-refs.c (vect_mark_for_runtime_alias_test): Likewise. (vect_enhance_data_refs_alignment): Likewise. (vect_prune_runtime_alias_test_list): Likewise. * tree-vect-loop.c (vect_analyze_loop_costing): Likewise. (vect_get_datarefs_in_loop): Likewise. (vect_analyze_loop): Likewise. * tree-vect-slp.c (vect_slp_bb): Likewise. * tree-vectorizer.h: Likewise. * tree-vrp.c (find_switch_asserts): Likewise. (vrp_prop::check_mem_ref): Likewise. * tree.c (wide_int_to_tree_1): Likewise. (cache_integer_cst): Likewise. * var-tracking.c (EXPR_USE_DEPTH): Likewise. (reverse_op): Likewise. (vt_find_locations): Likewise. 2019-11-12 Martin Liska <mliska@suse.cz> * gimple-parser.c (c_parser_parse_gimple_body): Replace old parameter syntax with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET macro. 2019-11-12 Martin Liska <mliska@suse.cz> * name-lookup.c (namespace_hints::namespace_hints): Replace old parameter syntax with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET macro. * typeck.c (comptypes): Likewise. 2019-11-12 Martin Liska <mliska@suse.cz> * lto-partition.c (lto_balanced_map): Replace old parameter syntax with the new one, include opts.h if needed. Use SET_OPTION_IF_UNSET macro. * lto.c (do_whole_program_analysis): Likewise. From-SVN: r278085
2019-11-08re PR tree-optimization/92324 (ICE in expand_direct_optab_fn, at internal-fn.c:2890)Richard Biener
2019-11-08 Richard Biener <rguenther@suse.de> PR tree-optimization/92324 * tree-vect-loop.c (vect_create_epilog_for_reduction): Use STMT_VINFO_REDUC_VECTYPE for all computations, inserting sign-conversions as necessary. (vectorizable_reduction): Reject conversions in the chain that are not sign-conversions, base analysis on a non-converting stmt and its operation sign. Set STMT_VINFO_REDUC_VECTYPE. * tree-vect-stmts.c (vect_stmt_relevant_p): Don't dump anything for debug stmts. * tree-vectorizer.h (_stmt_vec_info::reduc_vectype): New. (STMT_VINFO_REDUC_VECTYPE): Likewise. * gcc.dg/vect/pr92205.c: XFAIL. * gcc.dg/vect/pr92324-1.c: New testcase. * gcc.dg/vect/pr92324-2.c: Likewise. From-SVN: r277955
2019-11-08Generalise gather and scatter optabsRichard Sandiford
The gather and scatter optabs required the vector offset to be the integer equivalent of the vector mode being loaded or stored. This patch generalises them so that the two vectors can have different element sizes, although they still need to have the same number of elements. One consequence of this is that it's possible (if unlikely) for two IFN_GATHER_LOADs to have the same arguments but different return types. E.g. the same scalar base and vector of 32-bit offsets could be used to load 8-bit elements and to load 16-bit elements. From just looking at the arguments, we could wrongly deduce that they're equivalent. I know we saw this happen at one point with IFN_WHILE_ULT, and we dealt with it there by passing a zero of the return type as an extra argument. Doing the same here also makes the load and store functions have the same argument assignment. For now this patch should be a no-op, but later SVE patches take advantage of the new flexibility. 2019-11-08 Richard Sandiford <richard.sandiford@arm.com> gcc/ * optabs.def (gather_load_optab, mask_gather_load_optab) (scatter_store_optab, mask_scatter_store_optab): Turn into conversion optabs, with the offset mode given explicitly. * doc/md.texi: Update accordingly. * config/aarch64/aarch64-sve-builtins-base.cc (svld1_gather_impl::expand): Likewise. (svst1_scatter_impl::expand): Likewise. * internal-fn.c (gather_load_direct, scatter_store_direct): Likewise. (expand_scatter_store_optab_fn): Likewise. (direct_gather_load_optab_supported_p): Likewise. (direct_scatter_store_optab_supported_p): Likewise. (expand_gather_load_optab_fn): Likewise. Expect the mask argument to be argument 4. (internal_fn_mask_index): Return 4 for IFN_MASK_GATHER_LOAD. (internal_gather_scatter_fn_supported_p): Replace the offset sign argument with the offset vector type. Require the two vector types to have the same number of elements but allow their element sizes to be different. Treat the optabs as conversion optabs. 
* internal-fn.h (internal_gather_scatter_fn_supported_p): Update prototype accordingly. * optabs-query.c (supports_at_least_one_mode_p): Replace with... (supports_vec_convert_optab_p): ...this new function. (supports_vec_gather_load_p): Update accordingly. (supports_vec_scatter_store_p): Likewise. * tree-vectorizer.h (vect_gather_scatter_fn_p): Take a vec_info. Replace the offset sign and bits parameters with a scalar type tree. * tree-vect-data-refs.c (vect_gather_scatter_fn_p): Likewise. Pass back the offset vector type instead of the scalar element type. Allow the offset to be wider than the memory elements. Search for an offset type that the target supports, stopping once we've reached the maximum of the element size and pointer size. Update call to internal_gather_scatter_fn_supported_p. (vect_check_gather_scatter): Update calls accordingly. When testing a new scale before knowing the final offset type, check whether the scale is supported for any signed or unsigned offset type. Check whether the target supports the source and target types of a conversion before deciding whether to look through the conversion. Record the chosen offset_vectype. * tree-vect-patterns.c (vect_get_gather_scatter_offset_type): Delete. (vect_recog_gather_scatter_pattern): Get the scalar offset type directly from the gs_info's offset_vectype instead. Pass a zero of the result type to IFN_GATHER_LOAD and IFN_MASK_GATHER_LOAD. * tree-vect-stmts.c (check_load_store_masking): Update call to internal_gather_scatter_fn_supported_p, passing the offset vector type recorded in the gs_info. (vect_truncate_gather_scatter_offset): Update call to vect_check_gather_scatter, leaving it to search for a valid offset vector type. (vect_use_strided_gather_scatters_p): Convert the offset to the element type of the gs_info's offset_vectype. (vect_get_gather_scatter_ops): Get the offset vector type directly from the gs_info. (vect_get_strided_load_store_ops): Likewise. 
(vectorizable_load): Pass a zero of the result type to IFN_GATHER_LOAD and IFN_MASK_GATHER_LOAD. * config/aarch64/aarch64-sve.md (gather_load<mode>): Rename to... (gather_load<mode><v_int_equiv>): ...this. (mask_gather_load<mode>): Rename to... (mask_gather_load<mode><v_int_equiv>): ...this. (scatter_store<mode>): Rename to... (scatter_store<mode><v_int_equiv>): ...this. (mask_scatter_store<mode>): Rename to... (mask_scatter_store<mode><v_int_equiv>): ...this. From-SVN: r277949
2019-11-04[vect] Clean up orig_loop_vinfo from vect_analyze_loopAndre Vieira
gcc/ChangeLog: 2019-11-04 Andre Vieira <andre.simoesdiasvieira@arm.com> * tree-vect-loop.c (vect_analyze_loop): Remove orig_loop_vinfo parameter. * tree-vectorizer.h (vect_analyze_loop): Update declaration. * tree-vectorizer.c (try_vectorize_loop_1): Update calls to vect_analyze_loop. From-SVN: r277785
2019-11-04[SLP] SLP vectorization: vectorize vector constructorsJoel Hutton
gcc/ChangeLog: 2019-11-04 Joel Hutton <Joel.Hutton@arm.com> * expr.c (store_constructor): Modify to handle single element vectors. * tree-vect-slp.c (vect_analyze_slp_instance): Add case for vector constructors. (vect_slp_check_for_constructors): New function. (vect_slp_analyze_bb_1): Call new function to check for vector constructors. (vectorize_slp_instance_root_stmt): New function. (vect_schedule_slp): Call new function to vectorize root stmt of vector constructors. * tree-vectorizer.h (SLP_INSTANCE_ROOT_STMT): New field. gcc/testsuite/ChangeLog: 2019-11-04 Joel Hutton <Joel.Hutton@arm.com> * gcc.dg/vect/bb-slp-40.c: New test. * gcc.dg/vect/bb-slp-41.c: New test. From-SVN: r277784
2019-10-29[vect]PR 88915: Vectorize epilogues when versioning loopsAndre Vieira
gcc/ChangeLog: 2019-10-29 Andre Vieira <andre.simoesdiasvieira@arm.com> PR 88915 * tree-ssa-loop-niter.h (simplify_replace_tree): Change declaration. * tree-ssa-loop-niter.c (simplify_replace_tree): Add context parameter and make the valueize function pointer also take a void pointer. * gcc/tree-ssa-sccvn.c (vn_valueize_wrapper): New function to wrap around vn_valueize, to call it without a context. (process_bb): Use vn_valueize_wrapper instead of vn_valueize. * tree-vect-loop.c (_loop_vec_info): Initialize epilogue_vinfos. (~_loop_vec_info): Release epilogue_vinfos. (vect_analyze_loop_costing): Use knowledge of main VF to estimate number of iterations of epilogue. (vect_analyze_loop_2): Adapt to analyse main loop for all supported vector sizes when vect-epilogues-nomask=1. Also keep track of lowest versioning threshold needed for main loop. (vect_analyze_loop): Likewise. (find_in_mapping): New helper function. (update_epilogue_loop_vinfo): New function. (vect_transform_loop): When vectorizing epilogues re-use analysis done on main loop and call update_epilogue_loop_vinfo to update it. * tree-vect-loop-manip.c (vect_update_inits_of_drs): No longer insert stmts on loop preheader edge. (vect_do_peeling): Enable skip-vectors when doing loop versioning if we decided to vectorize epilogues. Update epilogues NITERS and construct ADVANCE to update epilogues data references where needed. * tree-vectorizer.h (_loop_vec_info): Add epilogue_vinfos. (vect_do_peeling, vect_update_inits_of_drs, determine_peel_for_niter, vect_analyze_loop): Add or update declarations. * tree-vectorizer.c (try_vectorize_loop_1): Make sure to use already created loop_vec_info's for epilogues when available. Otherwise analyse epilogue separately. From-SVN: r277569
2019-10-21tree-vectorizer.h (_slp_tree::ops): New member.Richard Biener
2019-10-21 Richard Biener <rguenther@suse.de> * tree-vectorizer.h (_slp_tree::ops): New member. (SLP_TREE_SCALAR_OPS): New. (vect_get_slp_defs): Adjust prototype. * tree-vect-slp.c (vect_free_slp_tree): Release SLP_TREE_SCALAR_OPS. (vect_create_new_slp_node): Initialize it. New overload for initializing by an operands array. (_slp_oprnd_info::ops): New member. (vect_create_oprnd_info): Initialize it. (vect_free_oprnd_info): Release it. (vect_get_and_check_slp_defs): Populate the operands array. Do not swap operands in the IL when not necessary. (vect_build_slp_tree_2): Build SLP nodes for invariant operands. Record SLP_TREE_SCALAR_OPS for all invariant nodes. Also swap operands in the operands array. Do not swap operands in the IL. (vect_slp_rearrange_stmts): Re-arrange SLP_TREE_SCALAR_OPS as well. (vect_gather_slp_loads): Fix. (vect_detect_hybrid_slp_stmts): Likewise. (vect_slp_analyze_node_operations_1): Search for an internal def child for computing reduction SLP_TREE_NUMBER_OF_VEC_STMTS. (vect_slp_analyze_node_operations): Skip ops-only stmts for the def-type push/pop dance. (vect_get_constant_vectors): Compute number_of_vectors here. Use SLP_TREE_SCALAR_OPS and simplify greatly. (vect_get_slp_vect_defs): Use gimple_get_lhs also for PHIs. (vect_get_slp_defs): Simplify greatly. * tree-vect-loop.c (vectorize_fold_left_reduction): Simplify. (vect_transform_reduction): Likewise. * tree-vect-stmts.c (vect_get_vec_defs): Simplify. (vectorizable_call): Likewise. (vectorizable_operation): Likewise. (vectorizable_load): Likewise. (vectorizable_condition): Likewise. (vectorizable_comparison): Likewise. From-SVN: r277241
2019-10-21Replace current_vector_size with vec_info::vector_sizeRichard Sandiford
2019-10-21 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (vec_info::vector_size): New member variable. (vect_update_max_nunits): Update comment. (current_vector_size): Delete. * tree-vect-stmts.c (current_vector_size): Likewise. (get_vectype_for_scalar_type): Use vec_info::vector_size instead of current_vector_size. (get_mask_type_for_scalar_type): Likewise. * tree-vectorizer.c (try_vectorize_loop_1): Likewise. * tree-vect-loop.c (vect_update_vf_for_slp): Likewise. (vect_analyze_loop, vect_halve_mask_nunits): Likewise. (vect_double_mask_nunits, vect_transform_loop): Likewise. * tree-vect-slp.c (can_duplicate_and_interleave_p): Likewise. (vect_make_slp_decision, vect_slp_bb_region): Likewise. From-SVN: r277235
2019-10-21Pass a vec_info to vect_double_mask_nunitsRichard Sandiford
2019-10-21 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (vect_double_mask_nunits): Take a vec_info. * tree-vect-loop.c (vect_double_mask_nunits): Likewise. * tree-vect-stmts.c (supportable_narrowing_operation): Update call accordingly. From-SVN: r277234
2019-10-21Pass a vec_info to vect_halve_mask_nunitsRichard Sandiford
2019-10-21 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (vect_halve_mask_nunits): Take a vec_info. * tree-vect-loop.c (vect_halve_mask_nunits): Likewise. * tree-vect-loop-manip.c (vect_maybe_permute_loop_masks): Update call accordingly. * tree-vect-stmts.c (supportable_widening_operation): Likewise. From-SVN: r277233
2019-10-21Pass a vec_info to supportable_narrowing_operationRichard Sandiford
2019-10-21 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (supportable_narrowing_operation): Take a vec_info. * tree-vect-stmts.c (supportable_narrowing_operation): Likewise. (simple_integer_narrowing): Update call accordingly. (vectorizable_conversion): Likewise. From-SVN: r277231
2019-10-21Pass a vec_info to can_duplicate_and_interleave_pRichard Sandiford
2019-10-21 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (can_duplicate_and_interleave_p): Take a vec_info. * tree-vect-slp.c (can_duplicate_and_interleave_p): Likewise. (duplicate_and_interleave): Update call accordingly. * tree-vect-loop.c (vectorizable_reduction): Likewise. From-SVN: r277229
2019-10-21Pass a vec_info to duplicate_and_interleaveRichard Sandiford
2019-10-21 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (duplicate_and_interleave): Take a vec_info. * tree-vect-slp.c (duplicate_and_interleave): Likewise. (vect_get_constant_vectors): Update call accordingly. * tree-vect-loop.c (get_initial_defs_for_reduction): Likewise. From-SVN: r277228
2019-10-21Pass a vec_info to get_vectype_for_scalar_typeRichard Sandiford
2019-10-21 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (get_vectype_for_scalar_type): Take a vec_info. * tree-vect-stmts.c (get_vectype_for_scalar_type): Likewise. (vect_prologue_cost_for_slp_op): Update call accordingly. (vect_get_vec_def_for_operand, vect_get_gather_scatter_ops) (vect_get_strided_load_store_ops, vectorizable_simd_clone_call) (vect_supportable_shift, vect_is_simple_cond, vectorizable_comparison) (get_mask_type_for_scalar_type): Likewise. (vect_get_vector_types_for_stmt): Likewise. * tree-vect-data-refs.c (vect_analyze_data_refs): Likewise. * tree-vect-loop.c (vect_determine_vectorization_factor): Likewise. (get_initial_def_for_reduction, build_vect_cond_expr): Likewise. * tree-vect-patterns.c (vect_supportable_direct_optab_p): Likewise. (vect_split_statement, vect_convert_input): Likewise. (vect_recog_widen_op_pattern, vect_recog_pow_pattern): Likewise. (vect_recog_over_widening_pattern, vect_recog_mulhs_pattern): Likewise. (vect_recog_average_pattern, vect_recog_cast_forwprop_pattern) (vect_recog_rotate_pattern, vect_recog_vector_vector_shift_pattern) (vect_synth_mult_by_constant, vect_recog_mult_pattern): Likewise. (vect_recog_divmod_pattern, vect_recog_mixed_size_cond_pattern) (check_bool_pattern, adjust_bool_pattern_cast, adjust_bool_pattern) (search_type_for_mask_1, vect_recog_bool_pattern): Likewise. (vect_recog_mask_conversion_pattern): Likewise. (vect_add_conversion_to_pattern): Likewise. (vect_recog_gather_scatter_pattern): Likewise. * tree-vect-slp.c (vect_build_slp_tree_2): Likewise. (vect_analyze_slp_instance, vect_get_constant_vectors): Likewise. From-SVN: r277227
2019-10-21Pass a vec_info to get_mask_type_for_scalar_typeRichard Sandiford
2019-10-21 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (get_mask_type_for_scalar_type): Take a vec_info. * tree-vect-stmts.c (get_mask_type_for_scalar_type): Likewise. (vect_check_load_store_mask): Update call accordingly. (vect_get_mask_type_for_stmt): Likewise. * tree-vect-patterns.c (check_bool_pattern): Likewise. (search_type_for_mask_1, vect_recog_mask_conversion_pattern): Likewise. (vect_convert_mask_for_vectype): Likewise. From-SVN: r277226
2019-10-21Pass a vec_info to vect_supportable_shiftRichard Sandiford
2019-10-21 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (vect_supportable_shift): Take a vec_info. * tree-vect-stmts.c (vect_supportable_shift): Likewise. * tree-vect-patterns.c (vect_synth_mult_by_constant): Update call accordingly. From-SVN: r277224
2019-10-18re PR target/86753 (gcc.target/aarch64/sve/vcond_[45].c fail after recent combine patch)Prathamesh Kulkarni
2019-10-18 Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> Richard Sandiford <richard.sandiford@arm.com> PR target/86753 * tree-vectorizer.h (scalar_cond_masked_key): New struct, and define hashmap traits for it. (loop_vec_info::scalar_cond_masked_set): New member. (vect_record_loop_mask): Adjust prototype. * tree-vectorizer.c (scalar_cond_masked_key::get_cond_ops_from_tree): Implement method. * tree-vect-loop.c (vectorizable_reduction): Pass NULL as last arg to vect_record_loop_mask. (vectorizable_live_operation): Likewise. (vect_record_loop_mask): New param scalar_mask. Add entry cond, loop_mask to scalar_cond_masked_set if scalar_mask is non NULL. * tree-vect-stmts.c (check_load_store_masking): New param scalar_mask. Pass it as last arg to vect_record_loop_mask. (vectorizable_call): Pass scalar_mask as last arg to vect_record_loop_mask. (vectorizable_store): Likewise. (vectorizable_load): Likewise. (vectorizable_condition): Check if another part of vectorized code applies loop_mask to condition or to its inverse, and if yes, apply loop_mask to result of vector comparison. testsuite/ * gcc.target/aarch64/sve/cond_cnot_2.c: Remove XFAIL from { scan-assembler-not {\tsel\t}. * gcc.target/aarch64/sve/cond_convert_1.c: Adjust to make only one load conditional. * gcc.target/aarch64/sve/cond_convert_4.c: Likewise. * gcc.target/aarch64/sve/cond_unary_2.c: Likewise. * gcc.target/aarch64/sve/vcond_4.c: Remove XFAIL's. * gcc.target/aarch64/sve/vcond_5.c: Likewise. Co-Authored-By: Richard Sandiford <richard.sandiford@arm.com> From-SVN: r277141
2019-10-17tree-vectorizer.h (_stmt_vec_info::cond_reduc_code): Remove.Richard Biener
2019-10-17 Richard Biener <rguenther@suse.de> * tree-vectorizer.h (_stmt_vec_info::cond_reduc_code): Remove. (STMT_VINFO_VEC_COND_REDUC_CODE): Likewise. * tree-vectorizer.c (vec_info::new_stmt_vec_info): Do not initialize STMT_VINFO_VEC_COND_REDUC_CODE. * tree-vect-loop.c (vect_is_simple_reduction): Set STMT_VINFO_REDUC_CODE. (vectorizable_reduction): Remove dead and redundant code, use STMT_VINFO_REDUC_CODE instead of STMT_VINFO_VEC_COND_REDUC_CODE. From-SVN: r277126
2019-10-17[vect] Refactor versioning thresholdAndre Vieira
gcc/ChangeLog: 2019-10-17 Andre Vieira <andre.simoesdiasvieira@arm.com> * tree-vect-loop.c (vect_transform_loop): Move code from here... * tree-vect-loop-manip.c (vect_loop_versioning): ... to here. * tree-vectorizer.h (vect_loop_versioning): Remove unused parameters. From-SVN: r277101
2019-10-17tree-vect-loop.c (needs_fold_left_reduction_p): Export.Richard Biener
2019-10-17 Richard Biener <rguenther@suse.de> * tree-vect-loop.c (needs_fold_left_reduction_p): Export. (vect_is_simple_reduction): Move all validity checks ... (vectorizable_reduction): ... here. Compute whether we need a fold-left reduction here. * tree-vect-patterns.c (vect_reassociating_reduction_p): Merge both overloads, check needs_fold_left_reduction_p directly. * tree-vectorizer.h (needs_fold_left_reduction_p): Declare. From-SVN: r277100
2019-10-11tree-vect-loop.c (vect_analyze_loop_operations): Adjust call to vectorizable_live_operation.Bernd Edlinger
2019-10-11 Bernd Edlinger <bernd.edlinger@hotmail.de> * tree-vect-loop.c (vect_analyze_loop_operations): Adjust call to vectorizable_live_operation. (vectorizable_live_operation): Adjust parameters. * tree-vect-stmts.c (vect_init_vector, vect_gen_widened_results_half): Fix typo in function comment. (can_vectorize_live_stmts): Adjust function comment. Adjust parameters. Adjust call to vectorizable_live_operation. (vect_analyze_stmt): Adjust call to can_vectorize_live_stmts. (vect_transform_stmt): Adjust function comment. Adjust call to can_vectorize_live_stmts. * tree-vectorizer.h (vectorizable_live_operation): Adjust parameters. From-SVN: r276886
2019-10-09tree-vectorizer.h (_stmt_vec_info::reduc_vectype_in): New.Richard Biener
2019-10-08 Richard Biener <rguenther@suse.de> * tree-vectorizer.h (_stmt_vec_info::reduc_vectype_in): New. (_stmt_vec_info::force_single_cycle): Likewise. (STMT_VINFO_FORCE_SINGLE_CYCLE): New. (STMT_VINFO_REDUC_VECTYPE_IN): Likewise. * tree-vect-loop.c (vectorizable_reduction): Set STMT_VINFO_REDUC_VECTYPE_IN and STMT_VINFO_FORCE_SINGLE_CYCLE. (vect_transform_reduction): Use them to remove redundant code. (vect_transform_cycle_phi): Likewise. From-SVN: r276752
2019-10-08tree-vectorizer.h (_stmt_vec_info::v_reduc_type): Remove.Richard Biener
2019-10-08 Richard Biener <rguenther@suse.de> * tree-vectorizer.h (_stmt_vec_info::v_reduc_type): Remove. (_stmt_vec_info::is_reduc_info): Add. (STMT_VINFO_VEC_REDUCTION_TYPE): Remove. (vectorizable_condition): Remove. (vectorizable_shift): Likewise. (vectorizable_reduction): Adjust. (info_for_reduction): New. * tree-vect-loop.c (vect_force_simple_reduction): Fold into... (vect_analyze_scalar_cycles_1): ... here. (vect_analyze_loop_operations): Adjust. (needs_fold_left_reduction_p): Simplify for single caller. (vect_is_simple_reduction): Likewise. Remove stmt restriction for nested cycles not part of double reductions. (vect_model_reduction_cost): Pass in the reduction type. (info_for_reduction): New function. (vect_create_epilog_for_reduction): Use it, access reduction meta off the stmt info it returns. Use STMT_VINFO_REDUC_TYPE instead of STMT_VINFO_VEC_REDUCTION_TYPE. (vectorize_fold_left_reduction): Remove pointless assert. (vectorizable_reduction): Analyze the full reduction when visiting the outermost PHI. Simplify. Use STMT_VINFO_REDUC_TYPE instead of STMT_VINFO_VEC_REDUCTION_TYPE. Direct reduction stmt code-generation to vectorizable_* in most cases. Verify code-generation only for cases handled by vect_transform_reduction. (vect_transform_reduction): Use info_for_reduction to get at reduction meta. Simplify. (vect_transform_cycle_phi): Likewise. (vectorizable_live_operation): Likewise. * tree-vect-patterns.c (vect_reassociating_reduction_p): Look at the PHI node for STMT_VINFO_REDUC_TYPE. * tree-vect-slp.c (vect_schedule_slp_instance): Remove no longer necessary code. * tree-vect-stmts.c (vectorizable_shift): Make static again. (vectorizable_condition): Likewise. Get at reduction related info via info_for_reduction. (vect_analyze_stmt): Adjust. (vect_transform_stmt): Likewise. * tree-vectorizer.c (vec_info::new_stmt_vec_info): Initialize STMT_VINFO_REDUC_TYPE instead of STMT_VINFO_VEC_REDUCTION_TYPE. * gcc.dg/vect/pr65947-1.c: Adjust. 
* gcc.dg/vect/pr65947-13.c: Likewise. * gcc.dg/vect/pr65947-14.c: Likewise. * gcc.dg/vect/pr65947-4.c: Likewise. * gcc.dg/vect/pr80631-1.c: Likewise. * gcc.dg/vect/pr80631-2.c: Likewise. From-SVN: r276700
2019-10-02tree-vectorizer.h (vect_transform_reduction): Declare.Richard Biener
2019-10-02 Richard Biener <rguenther@suse.de> * tree-vectorizer.h (vect_transform_reduction): Declare. * tree-vect-stmts.c (vect_transform_stmt): Use it. * tree-vect-loop.c (vectorizable_reduction): Split out reduction stmt transform to ... (vect_transform_reduction): ... this. From-SVN: r276452
2019-10-02tree-vectorizer.h (stmt_vec_info_type::cycle_phi_info_type): New.Richard Biener
2019-10-02 Richard Biener <rguenther@suse.de> * tree-vectorizer.h (stmt_vec_info_type::cycle_phi_info_type): New. (vect_transform_cycle_phi): Declare. * tree-vect-stmts.c (vect_transform_stmt): Call vect_transform_cycle_phi. * tree-vect-loop.c (vectorizable_reduction): Split out PHI transformation stage to ... (vect_transform_cycle_phi): ... here. From-SVN: r276441
2019-09-30gimple.c (gimple_get_lhs): For PHIs return the result.Richard Biener
2019-09-30 Richard Biener <rguenther@suse.de> * gimple.c (gimple_get_lhs): For PHIs return the result. * tree-vectorizer.h (vectorizable_live_operation): Also get the SLP instance as argument. * tree-vect-loop.c (vect_analyze_loop_operations): Also handle double-reduction PHIs with vectorizable_lc_phi. (vect_analyze_loop_operations): Adjust. (vect_create_epilog_for_reduction): Remove all code not dealing with reduction LC PHI or epilogue generation. (vectorizable_live_operation): Call vect_create_epilog_for_reduction for live stmts of reductions. * tree-vect-stmts.c (vectorizable_condition): When !for_reduction do not handle defs that are not vect_internal_def. (can_vectorize_live_stmts): Adjust. (vect_analyze_stmt): When the vectorized stmt defined a value used on backedges adjust the backedge uses of vectorized PHIs. From-SVN: r276299
2019-09-27tree-vectorizer.h (_stmt_vec_info::reduc_fn): New.Richard Biener
2019-09-27 Richard Biener <rguenther@suse.de> * tree-vectorizer.h (_stmt_vec_info::reduc_fn): New. (STMT_VINFO_REDUC_FN): Likewise. * tree-vectorizer.c (vec_info::new_stmt_vec_info): Initialize STMT_VINFO_REDUC_FN. * tree-vect-loop.c (vect_is_simple_reduction): Fix STMT_VINFO_REDUC_IDX for condition reductions. (vect_create_epilog_for_reduction): Compute all required state from the stmt to be vectorized. (vectorizable_reduction): Simplify vect_create_epilog_for_reduction invocation and remove then dead code. For single def-use chains record only a single vector stmt. From-SVN: r276180
2019-09-26tree-vect-loop.c (vect_analyze_loop_operations): Analyze loop-closed PHIs that are vect_internal_def.Richard Biener
2019-09-26 Richard Biener <rguenther@suse.de> * tree-vect-loop.c (vect_analyze_loop_operations): Analyze loop-closed PHIs that are vect_internal_def. (vect_create_epilog_for_reduction): Exit early for nested cycles. Simplify. (vectorizable_lc_phi): New. * tree-vect-stmts.c (vect_analyze_stmt): Call vectorizable_lc_phi. (vect_transform_stmt): Likewise. * tree-vectorizer.h (stmt_vec_info_type): Add lc_phi_info_type. (vectorizable_lc_phi): Declare. From-SVN: r276157
2019-09-26tree-vect-loop.c (vect_analyze_loop_operations): Also call vectorizable_reduction for vect_double_reduction_def.Richard Biener
2019-09-26 Richard Biener <rguenther@suse.de> * tree-vect-loop.c (vect_analyze_loop_operations): Also call vectorizable_reduction for vect_double_reduction_def. (vect_transform_loop): Likewise. (vect_create_epilog_for_reduction): Move double-reduction PHI creation and preheader argument setting of PHIs ... (vectorizable_reduction): ... here. Also process vect_double_reduction_def PHIs, creating the vectorized PHI nodes, remembering the scalar adjustment computed for the epilogue in STMT_VINFO_REDUC_EPILOGUE_ADJUSTMENT. Remember the original reduction code in STMT_VINFO_REDUC_CODE. * tree-vectorizer.c (vec_info::new_stmt_vec_info): Initialize STMT_VINFO_REDUC_CODE. * tree-vectorizer.h (_stmt_vec_info::reduc_epilogue_adjustment): New. (_stmt_vec_info::reduc_code): Likewise. (STMT_VINFO_REDUC_EPILOGUE_ADJUSTMENT): Likewise. (STMT_VINFO_REDUC_CODE): Likewise. From-SVN: r276150
2019-09-24tree-vectorizer.h (_stmt_vec_info::const_cond_reduc_code): Rename to...Richard Biener
2019-09-24 Richard Biener <rguenther@suse.de> * tree-vectorizer.h (_stmt_vec_info::const_cond_reduc_code): Rename to... (_stmt_vec_info::cond_reduc_code): ... this. (_stmt_vec_info::induc_cond_initial_val): Add. (STMT_VINFO_VEC_CONST_COND_REDUC_CODE): Rename to... (STMT_VINFO_VEC_COND_REDUC_CODE): ... this. (STMT_VINFO_VEC_INDUC_COND_INITIAL_VAL): Add. * tree-vectorizer.c (vec_info::new_stmt_vec_info): Adjust. * tree-vect-loop.c (get_initial_def_for_reduction): Pass in the reduction code. (vect_create_epilog_for_reduction): Drop special induction condition reduction params, pass in reduction code and simplify. (vectorizable_reduction): Perform condition reduction kind selection only at analysis time. Adjust passing on state. From-SVN: r276099
2019-09-20re PR testsuite/91821 (r275928 breaks gcc.target/powerpc/sad-vectorize-2.c)Richard Biener
2019-09-20 Richard Biener <rguenther@suse.de> PR tree-optimization/91821 * tree-vect-loop.c (check_reduction_path): Check we can compute reduc_idx. (vect_is_simple_reduction): Set STMT_VINFO_REDUC_IDX. * tree-vect-patterns.c (vect_reassociating_reduction_p): Return operands in canonical order. * tree-vectorizer.c (vec_info::new_stmt_vec_info): Initialize STMT_VINFO_REDUC_IDX. * tree-vectorizer.h (_stmt_vec_info::reduc_idx): New. (STMT_VINFO_REDUC_IDX): Likewise. From-SVN: r275996