@@ -12549,6 +12549,7 @@ the two, as explained in the sections below.
@menu
* Global Register Variables:: Variables declared at global scope.
* Local Register Variables:: Variables declared within a function.
+* Hard Register Constraints:: Operands forced into specific machine registers.
@end menu
@node Global Register Variables
@@ -12754,6 +12755,194 @@ with slightly different characteristics (@pxref{MIPS Coprocessors,,
Defining coprocessor specifics for MIPS targets, gccint,
GNU Compiler Collection (GCC) Internals}).
+@node Hard Register Constraints
+@subsubsection Hard Register Constraints
+
+Similar to register @code{asm} but still distinct, hard register constraints
+are another way to force operands of inline @code{asm} into specific machine
+registers. In contrast to register @code{asm} where a variable is bound to a
+machine register, a hard register constraint loads an @code{asm} operand into a
+machine register. Assume in the following that @code{r4} is a general-purpose
+register, @code{f5} a floating-point register, and @code{v6} a vector register
+for some target.
+
+@smallexample
+int x;
+int y __attribute__ ((vector_size (16)));
+@dots{}
+asm ("some instructions"
+ : "=@{r4@}" (x)
+ : "@{f5@}" (42.0), "@{v6@}" (y));
+@end smallexample
+
+For the inline @code{asm}, variable @code{x} is loaded into register @code{r4},
+and @code{y} into @code{v6}. Furthermore, constant @code{42.0} is loaded into
+floating-point register @code{f5}.
+
+A key difference between register @code{asm} and hard register constraints is
+that the latter are specified at the point where they are supposed to manifest,
+namely at inline @code{asm}, which may lead to more readable code.
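+
+As a rough sketch of this contrast, and assuming the same hypothetical register
+@code{r4} as above, the register @code{asm} form places the binding in the
+declaration of the variable, whereas the hard register constraint form keeps
+the register choice inside the @code{asm} itself:
+
+@smallexample
+/* Register asm: the binding lives in the declaration of x.  */
+register int x asm ("r4");
+asm ("some instructions" : "=r" (x));
+
+/* Hard register constraint: the binding lives at the asm.  */
+int y;
+asm ("some instructions" : "=@{r4@}" (y));
+@end smallexample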
+
+@subsubheading Usage
+
+Each input operand is loaded into the register specified by its corresponding
+hard register constraint. See @ref{Discussion Same Register} for some open
+questions regarding input operands.
+
+Similar to register @code{asm}, two outputs must not share a register. That
+is, the following example is invalid:
+
+@smallexample
+int x, y;
+asm ("" : "=@{r4@}" (x), "=@{r4@}" (y));
+@end smallexample
+
+See @ref{Discussion Same lvalue} for some open questions regarding output
+operands.
+
+The type of an operand must be supported by the corresponding machine register.
+
+A hard register constraint may refer to any general, floating-point, or vector
+register except a fixed register such as the stack pointer register. The set
+of allowed registers is target dependent, as for register @code{asm}.
+Furthermore, the referenced register must be a valid register name of the
+target. Note that on some targets a single register may be referred to by
+different names where each name specifies the length of the register. For
+example, on x86_64 the register names @code{rcx}, @code{ecx}, and @code{cx} all
+refer to the same register but in different sizes. If any of those names is
+used for a hard register constraint, the actual size of the register is
+determined by its corresponding operand. For example
+
+@smallexample
+long x;
+asm ("mov\t$42, %0" : "=@{ecx@}" (x));
+@end smallexample
+
+Although the hard register constraint refers to register @code{ecx}, the actual
+register will be @code{rcx} since on x86_64 a @code{long} is 8 bytes wide.
+This aligns with register @code{asm} where you could have
+
+@smallexample
+register long x asm ("ecx");
+@end smallexample
+
+It is debatable whether we should allow this or not.
+
+@subsubheading Interaction with Register @code{asm}
+
+A mixture of both constructs as for example
+
+@smallexample
+register int x asm ("r4") = 42;
+int y;
+asm ("" : "=@{r5@}" (y) : "r" (x));
+@end smallexample
+
+is valid.
+
+If an operand is a register @code{asm} variable and its constraint is a hard
+register constraint, then both must refer to the same register. That means
+
+@smallexample
+register int x asm ("r4");
+asm ("" : "=@{r4@}" (x));
+@end smallexample
+
+is valid and
+
+@smallexample
+register int x asm ("r4");
+asm ("" : "=@{r5@}" (x));
+@end smallexample
+
+is invalid.
+
+@subsubheading Limitations
+
+As mentioned above, at the moment fixed registers are not supported for hard
+register constraints. Thus, idioms like
+
+@smallexample
+register void *x asm ("rsp");
+asm ("" : "=r" (x));
+@end smallexample
+
+have no hard register constraint counterpart. This limitation may be lifted.
+
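+Put differently, and taking x86_64 as an assumed target, a hard register
+constraint that names the stack pointer directly, as in the following sketch,
+is expected to be rejected for now:
+
+@smallexample
+void *x;
+asm ("" : "=@{rsp@}" (x));
+@end smallexample
+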
+@anchor{Discussion Same Register}
+@subsubheading Discussion: Same Register
+
+In an ordinary use case a hard register constraint for inputs is used at most
+once within an alternative. However, given that register @code{asm} allows
+constructs like
+
+@smallexample
+register int x asm ("0") = 42;
+asm ("" : "=r" (x) : "0" (x), "r" (x));
+@end smallexample
+
+or even
+
+@smallexample
+register int x asm ("0") = 42;
+register int y asm ("0") = 24;
+asm ("" : "=r" (x) : "r" (x), "r" (y));
+@end smallexample
+
+where multiple input operands share the same register, it is debatable whether
+the analogue for hard register constraints should be allowed or not. One could
+argue that if multiple inputs share the same register and are also provably
+equal, then such constructs could be allowed as e.g.@: in
+
+@smallexample
+asm ("" : "=@{r4@}" (x) : "0" (x), "@{r4@}" (x), "@{r4@}" (x));
+@end smallexample
+
+assuming that @code{x} is an lvalue. Of course, if equality cannot be shown or
+the operands are trivially unequal, such programs must be rejected.
+
+For the sake of simplicity and hopefully fewer errors, I tend to generally
+disallow multiple inputs sharing the same hard register constraint within an
+alternative. Another argument against it is that showing equality depends
+heavily on optimizations, so whether a program is accepted or rejected would
+depend on them as well.
+
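+As a sketch of the trivially unequal case, two distinct constants cannot both
+be loaded into the hypothetical register @code{r4}, so an @code{asm} like the
+following would have to be rejected regardless of which policy is chosen:
+
+@smallexample
+asm ("" : : "@{r4@}" (1), "@{r4@}" (2));
+@end smallexample
+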
+@anchor{Discussion Same lvalue}
+@subsubheading Discussion: Same lvalue
+
+An inline @code{asm} as in the following is accepted
+
+@smallexample
+int x;
+asm ("" : "=r" (x), "=r" (x));
+@end smallexample
+
+where two distinct registers are allocated for a single object. However,
+referring to the same object multiple times in output operands seems error
+prone to me and I would rather reject such programs. Currently, in order to
+follow the established rules, the following is allowed
+
+@smallexample
+int x;
+asm ("" : "=@{r4@}" (x), "=@{r5@}" (x));
+@end smallexample
+
+Are there any reasons why such programs should be accepted?
+
+@subsubheading Discussion: Unions and Multiple Alternatives
+
+With the current implementation unions and multiple alternatives are supported.
+For example
+
+@smallexample
+asm ("" : "=@{r1@}@{r2@}@{r3@},m@{r4@}" (x) : "@{r4@},r" (y), "@{r5@},@{r4@}" (z));
+@end smallexample
+
+Note that both inputs make use of @code{r4} but in different alternatives,
+which is fine. Since LRA does all the hard work, this appears to work so far.
+However, I'm not entirely sure whether allowing alternatives or even unions is
+something we really want to support.
+
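+To make the notation easier to read, here is a smaller sketch using
+hypothetical registers and assuming @code{x} and @code{y} are @code{int}
+variables. In the first alternative @code{x} may live in either @code{r1} or
+@code{r2} (a union) while @code{y} must be in @code{r4}; in the second
+alternative @code{x} is a memory operand and @code{y} may be in any general
+register:
+
+@smallexample
+asm ("" : "=@{r1@}@{r2@},m" (x) : "@{r4@},r" (y));
+@end smallexample
+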
@node Size of an asm
@subsection Size of an @code{asm}
@@ -1366,6 +1366,12 @@ as for @samp{<} apply.
A register operand is allowed provided that it is in a general
register.
+@cindex hard registers in constraint
+@item @samp{@{r@}}
+An operand is bound to hard register @samp{r} which may be any general,
+floating-point, or vector register except a fixed register such as the stack
+pointer register. The set of fixed registers is target dependent.
+
@cindex constants in constraints
@cindex @samp{i} in constraint
@item @samp{i}
@@ -6974,6 +6974,115 @@ match_asm_constraints_1 (rtx_insn *insn, rtx *p_sets, int noutputs)
df_insn_rescan (insn);
}
+/* It is expected and desired that optimizations coalesce multiple pseudos into
+ one whenever possible. However, in case of hard register constraints we may
+ have to undo this and introduce copies since otherwise we could constrain a
+ single pseudo to different hard registers. For example, during register
+ allocation the following insn would be unsatisfiable since pseudo 60 is
+ constrained to hard registers r5 and r6 at the same time.
+
+ (insn 7 5 0 2 (asm_operands/v ("foo") ("") 0 [
+ (reg:DI 60) repeated x2
+ ]
+ [
+ (asm_input:DI ("{r5}") t.c:4)
+ (asm_input:DI ("{r6}") t.c:4)
+ ]
+ [] t.c:4) "t.c":4:3 -1
+ (expr_list:REG_DEAD (reg:DI 60)
+ (nil)))
+
+ Therefore, introduce a copy of pseudo 60 and transform it into
+
+ (insn 10 5 7 2 (set (reg:DI 62)
+ (reg:DI 60)) "t.c":4:3 1503 {*movdi_64}
+ (nil))
+ (insn 7 10 11 2 (asm_operands/v ("foo") ("") 0 [
+ (reg:DI 60)
+ (reg:DI 62)
+ ]
+ [
+ (asm_input:DI ("{r5}") t.c:4)
+ (asm_input:DI ("{r6}") t.c:4)
+ ]
+ [] t.c:4) "t.c":4:3 -1
+ (expr_list:REG_DEAD (reg:DI 62)
+ (expr_list:REG_DEAD (reg:DI 60)
+ (nil))))
+
+ Now, LRA can assign pseudo 60 to r5, and pseudo 62 to r6.
+
+ TODO: The current implementation is conservative and we could do a bit
+ better in case of alternatives. For example
+
+ (insn 7 5 0 2 (asm_operands/v ("foo") ("") 0 [
+ (reg:DI 60) repeated x2
+ ]
+ [
+ (asm_input:DI ("r,{r5}") t.c:4)
+ (asm_input:DI ("{r6},r") t.c:4)
+ ]
+ [] t.c:4) "t.c":4:3 -1
+ (expr_list:REG_DEAD (reg:DI 60)
+ (nil)))
+
+ For this insn we wouldn't need to come up with a copy of pseudo 60 since in
+ each alternative pseudo 60 is constrained exactly one time. */
+
+static void
+match_asm_constraints_2 (rtx_insn *insn, rtx pat)
+{
+ rtx op;
+ if (GET_CODE (pat) == SET && GET_CODE (SET_SRC (pat)) == ASM_OPERANDS)
+ op = SET_SRC (pat);
+ else if (GET_CODE (pat) == ASM_OPERANDS)
+ op = pat;
+ else
+ return;
+ int ninputs = ASM_OPERANDS_INPUT_LENGTH (op);
+ rtvec inputs = ASM_OPERANDS_INPUT_VEC (op);
+ bool changed = false;
+ auto_bitmap constrained_regs;
+
+ for (int i = 0; i < ninputs; ++i)
+ {
+ rtx input = RTVEC_ELT (inputs, i);
+ const char *constraint = ASM_OPERANDS_INPUT_CONSTRAINT (op, i);
+ if ((!REG_P (input) && !SUBREG_P (input))
+ || (REG_P (input) && HARD_REGISTER_P (input))
+ || strchr (constraint, '{') == nullptr)
+ continue;
+ int regno;
+ if (SUBREG_P (input))
+ {
+ if (REG_P (SUBREG_REG (input)))
+ regno = REGNO (SUBREG_REG (input));
+ else
+ continue;
+ }
+ else
+ regno = REGNO (input);
+ /* Keep the first usage of a constrained pseudo as is and only
+ introduce copies for subsequent usages. */
+ if (! bitmap_bit_p (constrained_regs, regno))
+ {
+ bitmap_set_bit (constrained_regs, regno);
+ continue;
+ }
+ rtx tmp = gen_reg_rtx (GET_MODE (input));
+ start_sequence ();
+ emit_move_insn (tmp, input);
+ rtx_insn *insns = get_insns ();
+ end_sequence ();
+ emit_insn_before (insns, insn);
+ RTVEC_ELT (inputs, i) = tmp;
+ changed = true;
+ }
+
+ if (changed)
+ df_insn_rescan (insn);
+}
+
/* Add the decl D to the local_decls list of FUN. */
void
@@ -7030,6 +7139,13 @@ pass_match_asm_constraints::execute (function *fun)
continue;
pat = PATTERN (insn);
+
+ if (GET_CODE (pat) == PARALLEL)
+ for (int i = XVECLEN (pat, 0) - 1; i >= 0; --i)
+ match_asm_constraints_2 (insn, XVECEXP (pat, 0, i));
+ else
+ match_asm_constraints_2 (insn, pat);
+
if (GET_CODE (pat) == PARALLEL)
p_sets = &XVECEXP (pat, 0, 0), noutputs = XVECLEN (pat, 0);
else if (GET_CODE (pat) == SET)
@@ -1284,6 +1284,20 @@ mdep_constraint_len (const char *s, file_location loc, int opno)
if (!strncmp (s, p->name, p->namelen))
return p->namelen;
+ if (*s == '{')
+ {
+ const char *end = s + 1;
+ while (*end != '}' && *end != '"' && *end != '\0')
+ ++end;
+ /* As in decode_hard_reg_constraint (), consider any overly long hard
+ register name an error. */
+ ptrdiff_t len = end - s;
+ if (*end == '}' && len > 1 && len < 31)
+ {
+ return len + 1;
+ }
+ }
+
error_at (loc, "error: undefined machine-specific constraint "
"at this point: \"%s\"", s);
message_at (loc, "note: in operand %d", opno);
@@ -1148,7 +1148,7 @@ write_insn_constraint_len (void)
unsigned int i;
puts ("static inline size_t\n"
- "insn_constraint_len (char fc, const char *str ATTRIBUTE_UNUSED)\n"
+ "insn_constraint_len (char fc, const char *str)\n"
"{\n"
" switch (fc)\n"
" {");
@@ -1181,6 +1181,8 @@ write_insn_constraint_len (void)
puts (" default: break;\n"
" }\n"
+ " if (str[0] == '{')\n"
+ " return ((const char *) rawmemchr (str + 1, '}') - str) + 1;\n"
" return 1;\n"
"}\n");
}
@@ -2128,6 +2128,82 @@ ira_get_dup_out_num (int op_num, alternative_mask alts,
+/* Return true if a replacement of SRC by DEST does not lead to unsatisfiable
+ asm. Thus, a replacement is valid if and only if SRC and DEST are not both
+ constrained by hard register constraints in the inputs of a single asm
+ statement. See match_asm_constraints_2() for more details. TODO: As in
+ match_asm_constraints_2() consider alternatives more precisely. */
+
+static bool
+valid_replacement_for_asm_input_p_1 (const_rtx asmops, const_rtx src, const_rtx dest)
+{
+ int ninputs = ASM_OPERANDS_INPUT_LENGTH (asmops);
+ rtvec inputs = ASM_OPERANDS_INPUT_VEC (asmops);
+ for (int i = 0; i < ninputs; ++i)
+ {
+ rtx input_src = RTVEC_ELT (inputs, i);
+ const char *constraint_src
+ = ASM_OPERANDS_INPUT_CONSTRAINT (asmops, i);
+ if (rtx_equal_p (input_src, src)
+ && strchr (constraint_src, '{') != nullptr)
+ for (int j = 0; j < ninputs; ++j)
+ {
+ rtx input_dest = RTVEC_ELT (inputs, j);
+ const char *constraint_dest
+ = ASM_OPERANDS_INPUT_CONSTRAINT (asmops, j);
+ if (rtx_equal_p (input_dest, dest)
+ && strchr (constraint_dest, '{') != nullptr)
+ return false;
+ }
+ }
+ return true;
+}
+
+static bool
+valid_replacement_for_asm_input_p (const_rtx src, const_rtx dest)
+{
+ /* Bail out early if there is no asm statement. */
+ if (!crtl->has_asm_statement)
+ return true;
+ for (df_ref use = DF_REG_USE_CHAIN (REGNO (src));
+ use;
+ use = DF_REF_NEXT_REG (use))
+ {
+ struct df_insn_info *use_info = DF_REF_INSN_INFO (use);
+ /* Only check real uses, not artificial ones. */
+ if (use_info)
+ {
+ rtx_insn *insn = DF_REF_INSN (use);
+ rtx pat = PATTERN (insn);
+ if (asm_noperands (pat) <= 0)
+ continue;
+ if (GET_CODE (pat) == SET)
+ {
+ if (!valid_replacement_for_asm_input_p_1 (SET_SRC (pat), src, dest))
+ return false;
+ }
+ else if (GET_CODE (pat) == PARALLEL)
+ for (int i = 0, len = XVECLEN (pat, 0); i < len; ++i)
+ {
+ rtx asmops = XVECEXP (pat, 0, i);
+ if (GET_CODE (asmops) == SET)
+ asmops = SET_SRC (asmops);
+ if (GET_CODE (asmops) == ASM_OPERANDS
+ && !valid_replacement_for_asm_input_p_1 (asmops, src, dest))
+ return false;
+ }
+ else if (GET_CODE (pat) == ASM_OPERANDS)
+ {
+ if (!valid_replacement_for_asm_input_p_1 (pat, src, dest))
+ return false;
+ }
+ else
+ gcc_unreachable ();
+ }
+ }
+ return true;
+}
+
/* Search forward to see if the source register of a copy insn dies
before either it or the destination register is modified, but don't
scan past the end of the basic block. If so, we can replace the
@@ -2177,7 +2253,8 @@ decrease_live_ranges_number (void)
auto-inc memory reference, so we must disallow this
optimization on them. */
|| sregno == STACK_POINTER_REGNUM
- || dregno == STACK_POINTER_REGNUM)
+ || dregno == STACK_POINTER_REGNUM
+ || !valid_replacement_for_asm_input_p (src, dest))
continue;
dest_death = NULL_RTX;
@@ -114,6 +114,7 @@
#include "target.h"
#include "rtl.h"
#include "tree.h"
+#include "stmt.h"
#include "predict.h"
#include "df.h"
#include "memmodel.h"
@@ -2165,6 +2166,7 @@ process_alt_operands (int only_alternative)
bool costly_p;
enum reg_class cl;
const HARD_REG_SET *cl_filter;
+ HARD_REG_SET hard_reg_constraint;
/* Calculate some data common for all alternatives to speed up the
function. */
@@ -2536,6 +2538,17 @@ process_alt_operands (int only_alternative)
cl_filter = nullptr;
goto reg;
+ case '{':
+ {
+ int regno = decode_hard_reg_constraint (p);
+ gcc_assert (regno >= 0);
+ cl = REGNO_REG_CLASS (regno);
+ CLEAR_HARD_REG_SET (hard_reg_constraint);
+ SET_HARD_REG_BIT (hard_reg_constraint, regno);
+ cl_filter = &hard_reg_constraint;
+ goto reg;
+ }
+
default:
cn = lookup_constraint (p);
switch (get_constraint_type (cn))
@@ -25,6 +25,7 @@ along with GCC; see the file COPYING3. If not see
#include "target.h"
#include "rtl.h"
#include "tree.h"
+#include "stmt.h"
#include "cfghooks.h"
#include "df.h"
#include "memmodel.h"
@@ -2367,7 +2368,8 @@ asm_operand_ok (rtx op, const char *constraint, const char **constraints)
{
case CT_REGISTER:
if (!result
- && reg_class_for_constraint (cn) != NO_REGS
+ && (reg_class_for_constraint (cn) != NO_REGS
+ || constraint[0] == '{')
&& GET_MODE (op) != BLKmode
&& register_operand (op, VOIDmode))
result = 1;
@@ -3301,6 +3303,13 @@ constrain_operands (int strict, alternative_mask alternatives)
win = true;
break;
+ case '{':
+ if ((REG_P (op) && HARD_REGISTER_P (op)
+ && (int) REGNO (op) == decode_hard_reg_constraint (p))
+ || !reload_completed)
+ win = true;
+ break;
+
default:
{
enum constraint_num cn = lookup_constraint (p);
@@ -52,6 +52,7 @@ along with GCC; see the file COPYING3. If not see
#include "tree-cfg.h"
#include "dumpfile.h"
#include "builtins.h"
+#include "output.h"
/* Functions and data structures for expanding case statements. */
@@ -174,6 +175,32 @@ expand_label (tree label)
maybe_set_first_label_num (label_r);
}
+/* Parse a hard register constraint and return its number or -1 in case of an
+ error. BEGIN should point to a string of the form `{regname}`. For the
+ sake of simplicity assume that a register name is shorter than 31
+ characters; longer names are treated as an error. */
+
+int
+decode_hard_reg_constraint (const char *begin)
+{
+ if (*begin != '{')
+ return -1;
+ ++begin;
+ const char *end = begin;
+ while (*end != '}' && *end != '\0')
+ ++end;
+ if (*end != '}' || end == begin)
+ return -1;
+ ptrdiff_t len = end - begin;
+ if (len >= 31)
+ return -1;
+ char regname[32];
+ memcpy (regname, begin, len);
+ regname[len] = '\0';
+ int regno = decode_reg_name (regname);
+ return regno;
+}
+
/* Parse the output constraint pointed to by *CONSTRAINT_P. It is the
OPERAND_NUMth output operand, indexed from zero. There are NINPUTS
inputs and NOUTPUTS outputs to this extended-asm. Upon return,
@@ -289,6 +316,12 @@ parse_output_constraint (const char **constraint_p, int operand_num,
*allows_mem = true;
break;
+ case '{':
+ {
+ *allows_reg = true;
+ break;
+ }
+
default:
if (!ISALPHA (*p))
break;
@@ -408,6 +441,12 @@ parse_input_constraint (const char **constraint_p, int input_num,
*allows_mem = true;
break;
+ case '{':
+ {
+ *allows_reg = true;
+ break;
+ }
+
default:
if (! ISALPHA (constraint[j]))
{
@@ -25,6 +25,7 @@ extern bool parse_output_constraint (const char **, int, int, int,
bool *, bool *, bool *);
extern bool parse_input_constraint (const char **, int, int, int, int,
const char * const *, bool *, bool *);
+extern int decode_hard_reg_constraint (const char *);
extern tree resolve_asm_operand_names (tree, tree, tree, tree);
#ifdef HARD_CONST
/* Silly ifdef to avoid having all includers depend on hard-reg-set.h. */
new file mode 100644
@@ -0,0 +1,85 @@
+/* { dg-do compile { target aarch64*-*-* arm*-*-* i?86-*-* powerpc*-*-* riscv*-*-* s390*-*-* x86_64-*-* } } */
+
+#if defined (__aarch64__)
+# define GPR "{x4}"
+/* { dg-final { scan-assembler-times "foo\tx4" 8 { target { aarch64*-*-* } } } } */
+#elif defined (__arm__)
+# define GPR "{r4}"
+/* { dg-final { scan-assembler-times "foo\tr4" 8 { target { arm*-*-* } } } } */
+#elif defined (__i386__)
+# define GPR "{ecx}"
+/* { dg-final { scan-assembler-times "foo\t%cl" 2 { target { i?86-*-* } } } } */
+/* { dg-final { scan-assembler-times "foo\t%cx" 2 { target { i?86-*-* } } } } */
+/* { dg-final { scan-assembler-times "foo\t%ecx" 4 { target { i?86-*-* } } } } */
+#elif defined (__powerpc__) || defined (__POWERPC__)
+# define GPR "{r5}"
+/* { dg-final { scan-assembler-times "foo\t5" 8 { target { powerpc*-*-* } } } } */
+#elif defined (__riscv)
+# define GPR "{t5}"
+/* { dg-final { scan-assembler-times "foo\tt5" 8 { target { riscv*-*-* } } } } */
+#elif defined (__s390__)
+# define GPR "{r4}"
+/* { dg-final { scan-assembler-times "foo\t%r4" 8 { target { s390*-*-* } } } } */
+#elif defined (__x86_64__)
+# define GPR "{rcx}"
+/* { dg-final { scan-assembler-times "foo\t%cl" 2 { target { x86_64-*-* } } } } */
+/* { dg-final { scan-assembler-times "foo\t%cx" 2 { target { x86_64-*-* } } } } */
+/* { dg-final { scan-assembler-times "foo\t%ecx" 2 { target { x86_64-*-* } } } } */
+/* { dg-final { scan-assembler-times "foo\t%rcx" 2 { target { x86_64-*-* } } } } */
+#endif
+
+char
+test_char (char x)
+{
+ __asm__ ("foo\t%0" : "+"GPR (x));
+ return x;
+}
+
+char
+test_char_from_mem (char *x)
+{
+ __asm__ ("foo\t%0" : "+"GPR (*x));
+ return *x;
+}
+
+short
+test_short (short x)
+{
+ __asm__ ("foo\t%0" : "+"GPR (x));
+ return x;
+}
+
+short
+test_short_from_mem (short *x)
+{
+ __asm__ ("foo\t%0" : "+"GPR (*x));
+ return *x;
+}
+
+int
+test_int (int x)
+{
+ __asm__ ("foo\t%0" : "+"GPR (x));
+ return x;
+}
+
+int
+test_int_from_mem (int *x)
+{
+ __asm__ ("foo\t%0" : "+"GPR (*x));
+ return *x;
+}
+
+long
+test_long (long x)
+{
+ __asm__ ("foo\t%0" : "+"GPR (x));
+ return x;
+}
+
+long
+test_long_from_mem (long *x)
+{
+ __asm__ ("foo\t%0" : "+"GPR (*x));
+ return *x;
+}
new file mode 100644
@@ -0,0 +1,33 @@
+/* { dg-do compile { target aarch64*-*-* powerpc64*-*-* riscv64-*-* s390*-*-* x86_64-*-* } } */
+/* { dg-options "-std=c99" } we need long long */
+
+#if defined (__aarch64__)
+# define GPR "{x4}"
+/* { dg-final { scan-assembler-times "foo\tx4" 2 { target { aarch64*-*-* } } } } */
+#elif defined (__powerpc__) || defined (__POWERPC__)
+# define GPR "{r5}"
+/* { dg-final { scan-assembler-times "foo\t5" 2 { target { powerpc64*-*-* } } } } */
+#elif defined (__riscv)
+# define GPR "{t5}"
+/* { dg-final { scan-assembler-times "foo\tt5" 2 { target { riscv64-*-* } } } } */
+#elif defined (__s390__)
+# define GPR "{r4}"
+/* { dg-final { scan-assembler-times "foo\t%r4" 2 { target { s390*-*-* } } } } */
+#elif defined (__x86_64__)
+# define GPR "{rcx}"
+/* { dg-final { scan-assembler-times "foo\t%rcx" 2 { target { x86_64-*-* } } } } */
+#endif
+
+long long
+test_longlong (long long x)
+{
+ __asm__ ("foo\t%0" : "+"GPR (x));
+ return x;
+}
+
+long long
+test_longlong_from_mem (long long *x)
+{
+ __asm__ ("foo\t%0" : "+"GPR (*x));
+ return *x;
+}
new file mode 100644
@@ -0,0 +1,25 @@
+/* { dg-do compile { target { { aarch64*-*-* powerpc64*-*-* riscv64-*-* s390*-*-* x86_64-*-* } && int128 } } } */
+/* { dg-options "-O2" } get rid of -ansi since we use __int128 */
+
+#if defined (__aarch64__)
+# define REG "{x4}"
+/* { dg-final { scan-assembler-times "foo\tx4" 1 { target { aarch64*-*-* } } } } */
+#elif defined (__powerpc__) || defined (__POWERPC__)
+# define REG "{r5}"
+/* { dg-final { scan-assembler-times "foo\t5" 1 { target { powerpc*-*-* } } } } */
+#elif defined (__riscv)
+# define REG "{t5}"
+/* { dg-final { scan-assembler-times "foo\tt5" 1 { target { riscv*-*-* } } } } */
+#elif defined (__s390__)
+# define REG "{r4}"
+/* { dg-final { scan-assembler-times "foo\t%r4" 1 { target { s390*-*-* } } } } */
+#elif defined (__x86_64__)
+# define REG "{xmm0}"
+/* { dg-final { scan-assembler-times "foo\t%xmm0" 1 { target { x86_64-*-* } } } } */
+#endif
+
+void
+test (void)
+{
+ __asm__ ("foo\t%0" :: REG ((__int128) 42));
+}
new file mode 100644
@@ -0,0 +1,50 @@
+/* { dg-do compile { target aarch64*-*-* arm*-*-* powerpc*-*-* riscv*-*-* s390*-*-* x86_64-*-* } } */
+
+#if defined (__aarch64__)
+# define FPR "{d5}"
+/* { dg-final { scan-assembler-times "foo\tv5" 4 { target { aarch64*-*-* } } } } */
+#elif defined (__arm__)
+# define FPR "{d5}"
+/* { dg-additional-options "-march=armv7-a+fp -mfloat-abi=hard" { target arm*-*-* } } */
+/* { dg-final { scan-assembler-times "foo\ts10" 4 { target { arm*-*-* } } } } */
+#elif defined (__powerpc__) || defined (__POWERPC__)
+# define FPR "{5}"
+/* { dg-final { scan-assembler-times "foo\t5" 4 { target { powerpc*-*-* } } } } */
+#elif defined (__riscv)
+# define FPR "{f5}"
+/* { dg-final { scan-assembler-times "foo\tf5" 4 { target { riscv*-*-* } } } } */
+#elif defined (__s390__)
+# define FPR "{f5}"
+/* { dg-final { scan-assembler-times "foo\t%f5" 4 { target { s390*-*-* } } } } */
+#elif defined (__x86_64__)
+# define FPR "{xmm5}"
+/* { dg-final { scan-assembler-times "foo\t%xmm5" 4 { target { x86_64-*-* } } } } */
+#endif
+
+float
+test_float (float x)
+{
+ __asm__ ("foo\t%0" : "+"FPR (x));
+ return x;
+}
+
+float
+test_float_from_mem (float *x)
+{
+ __asm__ ("foo\t%0" : "+"FPR (*x));
+ return *x;
+}
+
+double
+test_double (double x)
+{
+ __asm__ ("foo\t%0" : "+"FPR (x));
+ return x;
+}
+
+double
+test_double_from_mem (double *x)
+{
+ __asm__ ("foo\t%0" : "+"FPR (*x));
+ return *x;
+}
new file mode 100644
@@ -0,0 +1,36 @@
+/* { dg-do compile { target aarch64*-*-* powerpc64*-*-* riscv64-*-* s390*-*-* x86_64-*-* } } */
+
+typedef int V __attribute__ ((vector_size (4 * sizeof (int))));
+
+#if defined (__aarch64__)
+# define VR "{v20}"
+/* { dg-final { scan-assembler-times "foo\tv20" 2 { target { aarch64*-*-* } } } } */
+#elif defined (__powerpc__) || defined (__POWERPC__)
+# define VR "{v5}"
+/* { dg-final { scan-assembler-times "foo\t5" 2 { target { powerpc64*-*-* } } } } */
+#elif defined (__riscv)
+# define VR "{v5}"
+/* { dg-additional-options "-march=rv64imv" { target riscv64-*-* } } */
+/* { dg-final { scan-assembler-times "foo\tv5" 2 { target { riscv*-*-* } } } } */
+#elif defined (__s390__)
+# define VR "{v5}"
+/* { dg-require-effective-target s390_mvx { target s390*-*-* } } */
+/* { dg-final { scan-assembler-times "foo\t%v5" 2 { target s390*-*-* } } } */
+#elif defined (__x86_64__)
+# define VR "{xmm9}"
+/* { dg-final { scan-assembler-times "foo\t%xmm9" 2 { target { x86_64-*-* } } } } */
+#endif
+
+V
+test (V x)
+{
+ __asm__ ("foo\t%0" : "+"VR (x));
+ return x;
+}
+
+V
+test_from_mem (V *x)
+{
+ __asm__ ("foo\t%0" : "+"VR (*x));
+ return *x;
+}
new file mode 100644
@@ -0,0 +1,60 @@
+/* { dg-do compile { target aarch64*-*-* arm*-*-* i?86-*-* powerpc*-*-* riscv*-*-* s390*-*-* x86_64-*-* } } */
+/* { dg-options "-O2" } */
+
+/* Test multiple alternatives. */
+
+#if defined (__aarch64__)
+# define GPR1 "{x1}"
+# define GPR2 "{x2}"
+# define GPR3 "{x3}"
+/* { dg-final { scan-assembler-times "foo\tx1,x3" 1 { target { aarch64*-*-* } } } } */
+/* { dg-final { scan-assembler-times "bar\tx2,\\\[x1\\\]" 1 { target { aarch64*-*-* } } } } */
+#elif defined (__arm__)
+# define GPR1 "{r1}"
+# define GPR2 "{r2}"
+# define GPR3 "{r3}"
+/* { dg-final { scan-assembler-times "foo\tr1,r3" 1 { target { arm*-*-* } } } } */
+/* { dg-final { scan-assembler-times "bar\tr2,\\\[r1\\\]" 1 { target { arm*-*-* } } } } */
+#elif defined (__i386__)
+# define GPR1 "{eax}"
+# define GPR2 "{ebx}"
+# define GPR3 "{ecx}"
+/* { dg-final { scan-assembler-times "foo\t4\\(%esp\\),%ecx" 1 { target { i?86-*-* } } } } */
+/* { dg-final { scan-assembler-times "bar\t%ebx,\\(%eax\\)" 1 { target { i?86-*-* } } } } */
+#elif defined (__powerpc__) || defined (__POWERPC__)
+# define GPR1 "{r4}"
+# define GPR2 "{r5}"
+# define GPR3 "{r6}"
+/* { dg-final { scan-assembler-times "foo\t4,6" 1 { target { powerpc*-*-* } } } } */
+/* { dg-final { scan-assembler-times "bar\t5,0\\(4\\)" 1 { target { powerpc*-*-* } } } } */
+#elif defined (__riscv)
+# define GPR1 "{t1}"
+# define GPR2 "{t2}"
+# define GPR3 "{t3}"
+/* { dg-final { scan-assembler-times "foo\tt1,t3" 1 { target { riscv*-*-* } } } } */
+/* { dg-final { scan-assembler-times "bar\tt2,0\\(a1\\)" 1 { target { riscv*-*-* } } } } */
+#elif defined (__s390__)
+# define GPR1 "{r0}"
+# define GPR2 "{r1}"
+# define GPR3 "{r2}"
+/* { dg-final { scan-assembler-times "foo\t%r0,%r2" 1 { target { s390*-*-* } } } } */
+/* { dg-final { scan-assembler-times "bar\t%r1,0\\(%r3\\)" 1 { target { s390*-*-* } } } } */
+#elif defined (__x86_64__)
+# define GPR1 "{eax}"
+# define GPR2 "{ebx}"
+# define GPR3 "{rcx}"
+/* { dg-final { scan-assembler-times "foo\t%eax,%rcx" 1 { target { x86_64-*-* } } } } */
+/* { dg-final { scan-assembler-times "bar\t%ebx,\\(%rsi\\)" 1 { target { x86_64-*-* } } } } */
+#endif
+
+void
+test_reg_reg (int x, long long *y)
+{
+ __asm__ ("foo\t%0,%1" :: GPR1"m,"GPR2 (x), GPR3",m" (y));
+}
+
+void
+test_reg_mem (int x, long long *y)
+{
+ __asm__ ("bar\t%0,%1" :: GPR1"m,"GPR2 (x), GPR3",m" (*y));
+}
new file mode 100644
@@ -0,0 +1,41 @@
+/* { dg-do compile { target aarch64*-*-* arm*-*-* i?86-*-* powerpc*-*-* riscv*-*-* s390*-*-* x86_64-*-* } } */
+/* { dg-options "-O2" } */
+
+/* Test that an output and an input may use the same hard register. */
+
+#if defined (__aarch64__)
+# define GPR "{x1}"
+/* { dg-final { scan-assembler-times "foo\tx1,x1" 2 { target { aarch64*-*-* } } } } */
+#elif defined (__arm__)
+# define GPR "{r1}"
+/* { dg-final { scan-assembler-times "foo\tr1,r1" 2 { target { arm*-*-* } } } } */
+#elif defined (__i386__)
+# define GPR "{eax}"
+/* { dg-final { scan-assembler-times "foo\t%eax,%eax" 2 { target { i?86-*-* } } } } */
+#elif defined (__powerpc__) || defined (__POWERPC__)
+# define GPR "{r4}"
+/* { dg-final { scan-assembler-times "foo\t4,4" 2 { target { powerpc*-*-* } } } } */
+#elif defined (__riscv)
+# define GPR "{t1}"
+/* { dg-final { scan-assembler-times "foo\tt1,t1" 2 { target { riscv*-*-* } } } } */
+#elif defined (__s390__)
+# define GPR "{r0}"
+/* { dg-final { scan-assembler-times "foo\t%r0,%r0" 2 { target { s390*-*-* } } } } */
+#elif defined (__x86_64__)
+# define GPR "{eax}"
+/* { dg-final { scan-assembler-times "foo\t%eax,%eax" 2 { target { x86_64-*-* } } } } */
+#endif
+
+int
+test_1 (int x)
+{
+ __asm__ ("foo\t%0,%0" : "+"GPR (x));
+ return x;
+}
+
+int
+test_2 (int x, int y)
+{
+ __asm__ ("foo\t%0,%1" : "="GPR (x) : GPR (y));
+ return x;
+}
new file mode 100644
@@ -0,0 +1,49 @@
+/* { dg-do compile { target aarch64*-*-* arm*-*-* i?86-*-* powerpc*-*-* riscv*-*-* s390*-*-* x86_64-*-* } } */
+
+/* Due to hard register constraints, X must be copied. */
+
+#if defined (__aarch64__)
+# define GPR1 "{x1}"
+# define GPR2 "{x2}"
+#elif defined (__arm__)
+# define GPR1 "{r1}"
+# define GPR2 "{r2}"
+#elif defined (__i386__)
+# define GPR1 "{eax}"
+# define GPR2 "{ebx}"
+#elif defined (__powerpc__) || defined (__POWERPC__)
+# define GPR1 "{r4}"
+# define GPR2 "{r5}"
+#elif defined (__riscv)
+# define GPR1 "{t1}"
+# define GPR2 "{t2}"
+#elif defined (__s390__)
+# define GPR1 "{r0}"
+# define GPR2 "{r1}"
+#elif defined (__x86_64__)
+# define GPR1 "{eax}"
+# define GPR2 "{ebx}"
+#endif
+
+#define TEST(T) \
+int \
+test_##T (T x) \
+{ \
+ int out; \
+ __asm__ ("foo" : "=r" (out) : GPR1 (x), GPR2 (x)); \
+ return out; \
+}
+
+TEST(char)
+TEST(short)
+TEST(int)
+TEST(long)
+
+int
+test_subreg (long x)
+{
+ int out;
+ short subreg_x = x;
+ __asm__ ("foo" : "=r" (out) : GPR1 (x), GPR2 (subreg_x));
+ return out;
+}